Linus Lee Is Living With AI

How a researcher uses generative AI to help him think better and get more done

December 3, 2022

AI-researcher Linus Lee

Linus introduces himself

I'm an independent researcher focusing on building better interfaces for humans to interact with generative models. In particular, I’m interested in interfaces that allow for direct manipulation of text using latent space based language models. Basically, what that means is that I’m interested in exploring ways to use language models that look less like using prompts and more like familiar user interfaces like pinch-to-zoom or drag-and-drop.

Before becoming a researcher, I was your average React-TypeScript-wielding product engineer. But I always had a side hobby and interest in building knowledge tools to help people learn and read quickly and things like that. At the start of this year, I decided to take a year off from work to dive deeper into that question.

I spent the first half of the year looking at classic natural language processing (NLP) approaches to that question. Since May or June, I've been looking at more language model-based approaches.

How he does his research

I research generative AI, but I also use a lot of generative AI tools in my workflow. Some of those tools are ones I’ve built myself—I have an array of personal micro-tools that support what I do. I also use a few publicly available AI tools. The ones I want to share today break down into two big categories: tools for reading, and tools for personal knowledge management—like notes and search.

We’ll start with tools for reading.

Tools for reading

To find papers to read I use Elicit

A big part of research is figuring out what to read. It’s important to be able to do literature reviews and find papers that are relevant to the question you’re interested in.

Elicit is a literature review tool that uses language models to aid in people searching for what to read to answer a question. I might have a question like, "How does the depth of a transformer model affect its language modeling performance?" This is a fairly technical question, but it’s basically asking, "How does one part of the model’s structure affect its performance?"

Normally if I had this question I would Google something like, "language model depth paper." Then I have to look at each link to figure out: does this answer the question? Who is it by? Is it a reputable paper? What are the results?

It’s a time-intensive process. But Elicit changes that. If I ask a question in Elicit, it churns on it for a little bit and then outputs a list of relevant papers:

(A screenshot of Elicit)

The papers it finds are great. it also outputs a summary of the abstract, so I can quickly tell if the paper is going to answer the question that I have. It has filters for things like the number of participants in the study or what the intervention was in the study, which is relevant for social science papers.

This is a good way to just discover new papers, especially because there are paper links that you see on Twitter, but they're so biased towards the last couple of months, and this gets you historical papers that you might have missed.

ExplainPaper helps me understand dense papers

A new tool I’ve been using to help me read papers is called ExplainPaper. If you put in a paper, you can highlight any section of that paper and the tool will explain it to you.

This is useful for me because I don't come from a machine learning background, but I spend a lot of time reading papers to do the research I’m interested in.

Sometimes there are papers that are clearly written by engineers and other papers are clearly written by mathematicians. The ones that are written by mathematicians are just hard to wrap your mind around if you don't come from that world because they're a little bit too dense.

In ExplainPaper I can highlight a sentence that talks about an equation or a rule, and it will summarize and explain something that might be unclear:

(A screenshot of ExplainPaper)

I built a visual summarization tool for scanning through articles quickly

I read a lot, and sometimes I want to be able to get the most important parts of an article without having to read the whole thing. The obvious solution to this is a summarizer—but summaries have lots of limitations.

A summary is taking one wall of text and trying to get into a smaller wall of text. But you still have to scan through the summary. If you want to expand a specific point in the summary, you have to go back to the original text.

So, instead, I built a tool called CoStructure. It creates a visual heat map on top of an article to show me the most important sentences. So I can quickly jump to what’s most important without having to read the whole article.

I create the heatmaps with what’s called extractive summarization, which tries to identify sentences in the text that are most representative of the text as a whole by ranking every sentence in a text according to how many other sentences are similar to it. So if one sentence contains lots of topics that many other sentences mention, it ranks higher and is probably more central to the topic of the post.

Let’s say I’m reading an article and want to be able to skim through the most important points. I paste it into CoStructure like this:

(A screenshot of CoStructure)

CoStructure will highlight the most important sentences of the article for me in yellow, so that it’s easy to scan. If I'm interested in one part, I can click on that sentence and see what other sentences are most related to it:

(A screenshot of CoStructure)

It's a different way of exploring a text from the top down instead of from the start to the end.

Text is the most ubiquitous and the least user-friendly interface to information that we have. We have plots, we have graphs, we have tables. All of these are great. They are optimized for various different kinds of things for different uses.

Text is not optimized for anything at all. It's just a thing that we inherited from 3,000 years ago, and we haven't really improved it in any really interesting ways with technology. So the idea behind this was what if you treat text seriously as a medium for data visualization, and you try to take the understanding of text that computers can have now and overlay it on top of text to make it a more effective data visualization or data graphic.

Personal knowledge management tools

I take notes in a custom app called Notation

Another tool of my own creation is a notes app I call Notation. It’s my daily driver notes app right now, and it’s an outliner tool like Roam or Logseq. It’s a pretty typical outliner app, except it also has a machine learning angle to it.

If I open a note, you’ll see words are highlighted, and some are highlighted more darkly than others:

(A screenshot of Notation)

If I click a highlight, it will show me all of the other times I’ve mentioned this idea sorted by relevance:

(A screenshot of Notation)

The highlighting happens automatically by looking for other places in my notes archive where that word also appears. If it appears frequently, and if those other notes are closely related to my current note, the highlight gets bolder.

I built this as a response to the idea that no one should have to double-bracket every note that they take. (Editor’s note: see "The Fall of Roam" for more about double-bracketing notes.) I wanted to know what would happen if I built an app where you couldn’t double-bracket, and instead you had to rely on the model to do it.

My conclusion is that there are definitely corner cases where it’s useful to double-bracket, but for 90% of the cases, it’s possible to do it automatically by detecting semantic similarity.

I built two personal search engines to help me find all of my data

I have two search engines, Monocle and Revery. They search over an index of all of my data—journal entries, notes, contacts, tweets, bookmarks, and more.

Monocle is a full text search engine, which makes it useful for finding proper nouns and specific keywords. For example, if I want to know, "Have I met this person?" I can search their name in it. Here’s what happens if I search Taylor Swift:

(A screenshot of Monocle)

This type of search is also useful for personal trivia like airline rewards account numbers, discount codes, and restaurants with public restrooms.

Revery does semantic search—search based on similar meaning—which is more useful for finding out what I know about a topic. I use it as a Chrome extension: if I click a button, it will pull up results in a sidebar with related notes or websites I’ve seen:

(A screenshot of Revery)

A research example

The output of all of the tools I use in my workflow is the research results I generate. I’ll show you a few demos.

To grok these demos, it’s useful to have a working definition of what I mean by "latent spaces" and "latent space language models." You can think of a latent space as a way to map the space of every possible sentence that a model could write. In a latent space, similar sentences are "close" to one another. These models are quite good at finding, based on an input, the parts of the latent space of all possible sentences that the output is likely to live in.

The question I was interested in answering with this demo is: can you build a language model with a UI that lets you move around in the latent space? And where movement in certain directions can make meaningful changes to the text?

(A screenshot of Clay)

In this demo we have an example sentence about a couple going on a honeymoon to Greece. Then we get this rainbow colored pad (see screenshot) on the right with an X axis and a Y axis. Each axis represents another sentence—maybe the X axis is a sentence about black holes, and the Y axis is a sentence about a financial crisis.

If I drag this little dot around in space, it will modify the original sentence to be more or less about black holes. Or more or less about financial crises. The further I drag the further it goes:

(A screenshot of Clay)

Another example demo is called Protopolis. Again, it starts with a sentence:

(A screenshot of Protopolis)

In this example, I have two sentences, "We went to the supermarket to buy some apples this morning. We were planning to make some delicious apple pie later."

I can choose an attribute direction, like "length."

(A screenshot of Protopolis)

As I change the coefficient, it will make the text longer while preserving the original meaning.

I think these kinds of interfaces are going to be extremely important for AI to get more general adoption. Prompting models are still a kind of dark art, and for generative models to live up to their potential, humans are going to need much more precise and predictable control over their output. These interfaces are a glimpse into a future where we'll be able to control outputs from these models more directly and more precisely. It'll enable interactions like "pinch in to summarize this text," or rotating a dial to specify exactly how formal or blunt you want to sound in your email. I'm excited about this research direction because it opens up new ways of manipulating text directly in "meaning space" rather than just by typing paragraphs word-by-word into a text box.

Book recommendations

Diaspora by Greg Egan and Exhalation by Ted Chiang are the two books that have influenced me most deeply that I recommend to friends over and over again. They’ve changed the way I imagine the future of humans and technology and, by extension, deeply impacted the work I do with AI and text interfaces.

Diaspora is a science fiction novel about a far-off future where humanity spans our solar system, diverging into a myriad of genetically mutated biological forms and computer-enabled virtualized forms in the process. But the book isn’t really about space exploration or the manifest destiny of civilization. It follows a protagonist’s pursuit of the basic, eternal questions: What does it mean to be human? What is knowledge? Is there something beyond this universe? In the process, the novel blitzes through a tour de force of some of the most interesting ideas about future technologies, literacy and language, and living in simulation that I’ve ever come across, many of which directly inspire my work.

Exhalation is a collection of short stories by the writer Ted Chiang, whom you might know better as the author of the story that inspired the film Arrival. My best endorsement for this work is that it single-handedly convinced me, a devout writer of blogs and nonfiction essays, to try my hand at writing short science fiction stories myself. His stories are nominally about science and technology but take place in such fantastical alternate timelines that reading them often lets me peer back into our current reality through new lenses, as an observer from another world.

Do you build an AI tool or do AI research? Do you use AI to do your work?

If yes, I'd love to interview you for a future Superorganizers edition. Just hit reply to reach me!

What did you think of this post?

Amazing Good Meh Bad

Like this?
Become a subscriber.

Subscribe →

Or, learn more.

Thanks to our Sponsor: Tweet Hunter

Thank you to this week’s sponsor, Tweet Hunter, the tool for driving audience growth in just 5 minutes per day.

Try it for Free

Want to hide ads? Become a subscriber

Read this next:

Superorganizers

The Four Kinds of Side Hustles

The CEO of Kettle and Fire breaks down how
he thinks about side business opportunities

♥ 407 Sep 16, 2020 by Justin Mares

Superorganizers

The Double Life of Productivity’s
Most Famous Doctor

YouTuber Ali Abdaal shares how he makes the
productivity videos that have netted him 1M subscribers

♥ 136 Sep 24, 2020 by Dan Shipper

Superorganizers

How to Make Yourself
Into a Learning Machine

Shopify’s Director of Production
Engineering explains how reading broadly
helps him get to the bottom of things

♥ 165 Mar 3, 2020

Napkin Math

Uber and Lyft Live Once Again

What drivers think of AB 5, why it was written,
and the pricing differences between services

♥ 1 Aug 28, 2020 by Adam Keesling

How to Balance Meaning and Money

What the science says about your
relationship with what you do

♥ 78 Jan 3, 2023 by Simone Stolzoff

Comments

@jeremy.klammer about 1 month ago

Hi Dan - very interesting article. Are the tools that you mention which are not hyperlinked available publicly? The Revery tool peeked my interest in particular. Thanks for the post!

♡ 1 · Reply

Linus Lee Is Living With AI

Sponsored By: Tweet Hunter

Linus introduces himself

How he does his research

Tools for reading

To find papers to read I use Elicit

ExplainPaper helps me understand dense papers

I built a visual summarization tool for scanning through articles quickly

Personal knowledge management tools

I take notes in a custom app called Notation

I built two personal search engines to help me find all of my data

A research example

Book recommendations

What did you think of this post?

Like this?Become a subscriber.

Thanks to our Sponsor: Tweet Hunter

Read this next:

The Four Kinds of Side Hustles

The Double Life of Productivity’sMost Famous Doctor

How to Make YourselfInto a Learning Machine

Uber and Lyft Live Once Again

How to Balance Meaning and Money

Comments

Like this?
Become a subscriber.

The Double Life of Productivity’s
Most Famous Doctor

How to Make Yourself
Into a Learning Machine