Search spaces. Scrabble. Board Evaluation. Memory

What did we learn in 14 days of computational thinking.

Jawwad Ahmed Farid
10 min readSep 4, 2023

A note for ITC and ICA students at IBA, Karachi. On Scrabble and all that it taught us over two weekends.

The look ahead tree for Scrabble AI engine and all that it taught us this week in computational thinking.

The path, the journey is the destination. You learnt more parsing and processing pre-releases than within the actual quiz itself. The prep for the quiz is the quiz, not the quiz.

When it comes to computing and processing power, processing power alone will take you so far. You would be surprised how far you can go with decent memory. Or how quickly Excel crashes when you run out of it.

If you think you don’t need to write efficient code, or don’t need others to write it for you, you haven’t met the right search problem or run out of RAM on a 32 Gig machine.

As we add dimensions, the search space grows at combinatorial speed (a fancy way of saying really fast). At the same time yield that tracks effective solutions found as a result of the same search, goes down. In our case the search space was all possible letter combinations using a slate of tiles. The yield, validated words against the Oxford and Collins dictionary.

The secret of efficient search is to learn to look in the right places. If you find the right starting point, search is faster. Fish, where the fishes are. Indexing and heuristics are fishing where the fish are.

In order to get to faster search you have to learn to speak in specifics. Specify what you need and what you are searching for in clear, precise and specific terms. Specify what the fish looks like for you. If you don’t know what you are looking for its easy to get lost, lose track or go around in circles.

Enumerating a problem space is a counting problem. Defining a search space, counting problem. Paths that lead to a workable solution, also a counting problem. Finding paths and solutions is a search problem.

Counting and searching are two key skills in computer science. While defining problem spaces is a counting problem, finding solutions paths leading to acceptable solutions within those spaces is a search problem. Together the two represent the heart of the subject and the work we do every day. If you want to stand out as a CS professional or go far in the field, get good at both.

Before search and counting, we spent a lot of time agonizing over the E.coli genome versus Linux function call graph article. Why? Because it makes two key point you would do well to remember. Nature has been programming systems for hundred of millions of years. Robust, stable, workable, sustainable reactions. We could all learn from nature when robustness, stability and uptime is important for the systems we design. The evolution of nature’s programming approach over hundreds of millions of years suggests a hierarchy of change. A hierarchy where core worker functions are specialized and don’t change but managerial functions do.

The E.coli vs Linux Call graph truth table.

Retakes are good for the soul. When offered a retake, take it. Even if you are happy with your score.

If you want to travel fast, walk alone. If you want to travel far, walk together.

Building an competitive engine for Scrabble is harder than building one for Chess. Scrabble is a more complex game to model. More complex because it is essentially Chess with words and adds luck, incomplete information and multi dimensional strategy to the mix.

On memory.

Why is memory important? Because of faster search and response times. Because it is expensive it is also in short supply on your systems compared to slower storage. Which brings us to our next question. Why did Excel crash for so many of you at higher dimensions of the word generation problem space? It did so because you systems ran out of memory and couldn’t allocate any more to Excel when it called for it.

Optimize memory usage. Why? Because no one else does. Do you really need 8 gig of RAM to run a 7 tile letter combination and validate against it the 280,000 strong Collins dictionary? You shouldn’t, but you did. Where did it all go? Everyone else, every other program on your task manager bar, took it. Because by default we assume that we will always have memory. We all do. And because we all do, we run short of it. The best example of this are office products. Microsoft Team is the worst of all memory hogs. It shouldn’t be, but it is.

So when you write code, assume you have limited memory. Assume you will run short. Even when you do have infinite memory, you will run short, so respect memory and use it judiciously. All the processing power in the world is worthless if you run out of memory.

Turns out memory and how it is organized isn’t just a factor in computers, it is the key differentiator that segregates the ordinary from the extra-ordinary.

Defining problems.

To solve a problem you must first define it. That is harder than you think. What was the count of your raw output of letter combinations for a 7 letter slate? What do you think the count would be for a 15 letter slate? Do you really need to search for 15 letter words? Can you think of any scenario where you will need them?

How about Scrabble board evaluation? What is the right way of defining this problem. Information compression? Signal detection? State transition? Lookahead simulation? Which one of these paths make the most sense to you and your team?

When defining problem focus on the relevant. Start broad and then put limits and restrictions that would narrow the scope of your search. Important to start broad at the beginning to ensure you don’t miss any important element of the space that you would need to later factor in your design.

Search hacks

How do you find the right place to start a search? Think in terms of what does your optimal solution looks like. Then define attributes that identify the path to that solution. Then search for those attributes. Given a choice of words and anchors which one would you favor? The ones that include the word Z, Q, X, J, K, P, B and M over other alphabets? The ones closes to double and triple letter words. Is ZIPS a better choice than AHEAD, TRADE, BOLD or ALONE? How would you answer change if you were playing your choice over or near a triple word score or triple letter score tile?

Change and Stability. Entropy and Inertia.

Even though the word generation using Excel process was mechanical, repetitive and documented it broke for many of you. Because every time you added a letter to the slate, you introduced change. Every time you moved up a dimension you did it manually, mechanically with your own hands. Change to a system that you understood well enough, but still change. Then the dictionary switch from Oxford wordlist to Collins for validation in the last hour. That really killed things for a lot of you.

Change at the last minute and you will let entropy take charge and run forth with havoc. If you missed the point while reading the E.coli article, this exercise should have reinforced it again.

The wrong way.

Are there better and more efficient ways of solving the word generation challenge? Yes. Why didn’t we use them? You tell me. There is a mechanical hand driven method that is a pain to execute, mined with potential errors and is likely to explode when you need it most. Not really better it doesn’t require you to run power query. So if you didn’t have access to power query, you could use that. Better is relative to what you are solving for and your context.

How do we build an intuition for finding the right way, faster? Do things the wrong way enough and the paths will start talking back to you.

State transitions.

When modeling the Scrabble AI we introduced the concept of a decision tree. The tree started at point zero and then projected future states of the board and the game. It did this by projecting possible combinations of tile placements on the board and the resulting score. Essentially identifying the state of the board and how it changes from one move to the next.

The board transition diagram.

When we place a tile on the board or play our move, the state of the game changes. The move changes the board, player scores, tiles on the board, tiles in the bag, win probability and the path to winning the game.

If we want to win the game, like chess we have to project the board. In order for us to do this correctly we have to summarize and compress all the information available on the board into a single signal. One signal that can traverse back and forth across all projected state and bring the optimal path back to us at time zero, the present. The signal is our time traveler on a search mission. It moves forward and backward in time on the decision tree to find the optimal move.

Searching for the signal. Time traveling across the tree.

Board Evaluation

That signal is the board evaluation algorithm. We started with a simple bare bone model. A model that can be tweaked and replaced by more powerful calculators. Our idea this week was to illustrate a possible way in which a Scrabble board could be evaluated across different stages of time.

At a superficial level any board evaluation algorithm may appear to be a simple linear exercise in modeling. But when you change the lens to evaluating future projected moves with a view of pruning and trimming paths of the decision tree that lead to non-desired outcomes, you change dimensions. This is once again a search problem but we will have more to say about this later on in the course.

As states transition for pre-move to post-move, focus on what is changing, why it’s changing, who is changing it and how does it effect our search for a path to winning the game.

The board evaluation algorithm.

We did that by identifying drivers at play in changing the state of our game.

Stage of the game. Stage of the game as represented by the total tiles already placed on the board. Less tile represent an open board and an early stage. More tiles, less open boards and later stages. The stages require a different strategy and searches in different directions.

Special Squares remaining. To account for leverage that is possible by doubling or tripling scores.

Player competence. As represented by differential in player scores.

Luck. The factor of luck represented by the availability of the letters Z, Q, X, J on the board or in the tile bag.

Remember this is a proposed model. We haven’t tested it as yet. We just wanted to see if it calibrated well with the different stages of the game and if board evaluations scores tracked reasonably with the ability of a player to win a game at a given stage.

We didn’t have enough time to think through test cases for this engine. How would you do that? Boundary conditions. What should you expect to see at the opening of the game, at the end? Does the model do what you expected it to do? If it does, that is great. If it didn’t what do you need to change to fix it.

Dissected.

The actor, behavior, roles framework.

The actors, behavior and role framework and its application in narrating use cases and user stories simplified the explanation of board transitions, state transitions and the board evaluation algorithm. The analogy of the time traveler simplified the presentation of the decision tree concept.

Visual presentations make it easier to explain what we want to do or want done. There is no restriction on how we decide to share these stories. It doesn’t matter what tools or representation you use. What matters is how deep you go when you decide to dissect a scenario, a use case or a design.

State transition diagrams. Remember these 3 words.

Thank you.

It has only been two weeks since the arcs of our lives intersected. I have to tell you how incredibly proud I am of how far you have come in these weeks. And how far I think you will go before our shared journey ends. It’s how readily you have stepped up to challenges we have collectively thrown at you, how quickly you have come together as a batch around your class anchors and how eagerly you opt for retakes. Your class anchors are the stars you go to for help when you are in trouble. Outside of the classrooms these students will become your best source of inspiration and guidance. Value them. Don’t take them for granted. They don’t have to do what they do but we are all grateful for their contribution.

It has always been a privilege to teach you at IBA and it will always be. Thank you for allowing me to be your guide on this path.

The computational thinking at IBA course play list

--

--

Jawwad Ahmed Farid
Jawwad Ahmed Farid

Written by Jawwad Ahmed Farid

Serial has been. 5 books. 6 startups. 1 exit. Professor of Practice, IBA, Karachi. Fellow Society of Actuaries. https://financetrainingcourse.com/education/

No responses yet