Hi Ashvin,

AshvinP wrote: ↑ Wed Jan 29, 2025 4:29 pm
"I also just came across this video by Angus on DeepSeek AI and its potentially redemptive uses. Cleric, have you had a chance to experiment with this new AI yet? I posted a response about what I think is going on, based on our discussions here, but it would be interesting to hear any additional thoughts or whether I am missing something."
I think your comment to Angus was pretty concise; did you get any response?
The fuss around DeepSeek seems to be more political than technical.

I admit that I use AI quite regularly. In the IT sphere it is useful (I don't use it for writing code, but for getting information that would otherwise take too long to look for in manuals, forums, and QA sites). I think I have mentioned this before, but at present I see it as a sort of next-gen search engine.
The idea of indexing is quite old. I don't know when exactly it became customary, but even today some books have an index at the end where we can look for a word and see on which pages it appears. Today with computers this indexing is everywhere. Search engines are basically indexers - they map keywords to URLs.
There are two aspects to indices. First, they need to be ordered. For example, if the index at the back of a book listed its words randomly instead of in alphabetical order, finding the word we need would be very tedious - we would have to go through the entries one by one. When there's order, we can find what we need much more quickly by bisecting. This is basically how we find a word in a dictionary: we open it somewhere and see whether we need to go back or forward, then check whether we have overshot and need to return (but not so far that we go past our earlier page), and so on. The second aspect is the reference. In computers, the thing we search with (the word, in the book example) is the 'key', while what stands against it is the 'value'.
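To make the two aspects concrete, here is a minimal sketch in Python (toy data, purely illustrative): the keys are kept sorted so that the standard-library bisect module can jump to the right place instead of scanning entry by entry, and each key stands against a value - here, the pages where the word appears.

```python
import bisect

# A book-style index: sorted keys, each standing against the pages
# where the word appears (the 'value').
keys = ["apple", "horse", "index", "token", "zebra"]   # must stay sorted
pages = [[3, 17], [42], [1, 99], [7], [120]]

def lookup(word):
    """Bisect the sorted key list instead of scanning it one by one."""
    i = bisect.bisect_left(keys, word)
    if i < len(keys) and keys[i] == word:
        return pages[i]
    return None

print(lookup("index"))   # [1, 99]
print(lookup("laser"))   # None - the word is not in the index
```

Each bisection halves the remaining range, so even a huge index needs only a handful of probes - exactly the back-and-forth overshoot-and-return we do with a paper dictionary.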
Another thing to mention is filtering. This is, for example, when we have an Excel sheet with many rows and we apply some criteria - column D should be between 100 and 200, column F should be so and so, etc. The naive way to filter is to go row by row and discard those that don't satisfy the condition. This is what Excel generally does. But in databases (which are akin to Excel sheets with a fixed column count and type), where there could be millions of rows, such one-by-one testing is very slow. Instead, there could be an index, for example, on column D, and then we can easily narrow down from there. The index stores the values in order, and against each value it lists the row numbers where that value occurs. Just as when looking through a book's index, we can easily extract all keys between 100 and 200 (because they are ordered) and take the corresponding row numbers.
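This range-scan idea can be sketched as follows (a toy in-memory index with made-up data, not how any particular database lays things out): two bisections find the edges of the 100-200 range, and the row numbers are collected from the slice in between - no row outside the range is ever touched.

```python
import bisect

# Hypothetical index on column D: values kept in sorted order,
# each value standing against the row numbers where it occurs.
index_d = [(90, [4]), (120, [0, 7]), (150, [2]), (210, [5])]
values = [v for v, _ in index_d]   # the sorted keys alone

def rows_between(lo, hi):
    """Find all rows whose column-D value lies in [lo, hi] via two bisections."""
    start = bisect.bisect_left(values, lo)    # first value >= lo
    end = bisect.bisect_right(values, hi)     # first value > hi
    rows = []
    for _, row_numbers in index_d[start:end]:
        rows.extend(row_numbers)
    return sorted(rows)

print(rows_between(100, 200))   # [0, 2, 7]
```

The contrast with the naive approach is that the work is proportional to the number of matching entries plus two logarithmic searches, rather than to the total number of rows.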
In language models (LMs) things are more convoluted, because we do not have such a clean separation between keys and values. Instead, we may say that the whole concatenated sequence of tokens (the prompt + any hidden context) is like a key that retrieves the next token (the value). Then the previous sequence + the value becomes the new key, and so on.
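That loop can be caricatured like this (a pure toy: next_token and its little lookup table are invented stand-ins; a real LM computes a probability distribution over all possible tokens rather than doing a dictionary lookup):

```python
def next_token(sequence):
    """Stand-in for a model: the entire sequence so far acts as the 'key'."""
    fake_model = {
        ("the", "cat"): "sat",
        ("the", "cat", "sat"): "down",
    }
    return fake_model.get(tuple(sequence), "<end>")

tokens = ["the", "cat"]
while True:
    value = next_token(tokens)   # key = whole sequence, value = next token
    if value == "<end>":
        break
    tokens.append(value)         # the key for the next step grows by one token
print(tokens)   # ['the', 'cat', 'sat', 'down']
```

The point of the caricature is only the key/value structure: at every step the full sequence is the key, and the retrieved value is immediately folded back into the next key.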
Speaking of the redemption of language and technology, I can say that when I use LMs I find myself thinking harder while writing the prompt. This is understandable if we imagine that the better we supply the filter criteria in the key, the more precise the value may come out. It reminds me of Whitehead's negative questions. For example, before writing the prompt, we can imagine that the possible output value is like white light - it could be anything. Then with each additional token in the prompt, some of the spectral components are blotted out, or shifted around, gradually narrowing down the most fitting value. I notice that I'm much more conscious of this process when I try to write the prompt. Of course, I'm not saying that what I do matches the workings of the LM, but nevertheless, I find myself being very careful to pick the proper words in order to triangulate what I'm looking for and avoid ambiguity. And this is not limited to LMs only. Basically the same thing holds when we communicate with people, except that the key there is not merely a sequence of tokens. Even supersensible factors play into what ideal state the other person will land in.
So once again, we can see that something of value can be extracted - not so much from the answers that the LMs give. Those can be considered an automation: just as using a book's index saves us from reading through its entirety in search of a word, so asking LMs things can save us from browsing and researching. But trying to understand the process certainly stimulates thinking, at least for me. If nothing else, it highlights the old saying, "Half of the answer is contained in the properly formulated question." For example, when I've been playing with image creation, I realized how often I want to generate something that I have only the vaguest idea about (usually some dim Imagination that I can't yet get into form). But if I can't describe what I want, how could the model guess it (where there are no supersensible factors and everything must come entirely from the sequence of tokens)? This actually forces me to think, and in the process the picture in my imagination also becomes clearer. In another way, this hints at the tight connection between pictorial and verbal thinking. The pictures become more vivid as they become pregnant with all the potential ways in which they can be described.