Forget the sparkles; the trendy metaphor in the AI world lately is the deep.
For too long, automated research tools have only been able to scratch the surface of the vast ocean of knowledge. If you wanted to dive, to delve into the hidden depths where the true secrets lurk, you needed human researchers, librarians, intrepid explorers of the web. No longer! The big AI labs, as well as many of the smaller companies that specialize in wrapping the work of the big AI labs into a nice UI, have now released Deep Everything. Deep Search, Deep Research, Deep Review, Deep Seek. No matter what you’re looking for, we can only assume that AI would already have given it to you if it weren’t deep. So, strap in, we’re going to use the latest tech to traverse the epipelagic and mesopelagic zones, and hang out with the giant squids and the bioluminescent hatchetfish in the bathypelagic and abyssopelagic. When we swim back up we will, hopefully, surface some profound insight that has never before seen the light of day.

Now, it would be natural to think that this metaphor has something to do with the previous generation of “deep” AI companies and tools. Nothing could be further from the truth. DeepMind, DeepL, DeepFace, DeepSpeech: none of these do deep research; all of them are but a play on deep learning. Here is a phrase that carries much more gravitas than it deserves. For machine learning is deemed deep simply when it uses a deep neural network, and a neural network is deep merely when it contains two or more hidden layers in the middle. A neural network is just a mathematical object that feeds on some input and then spits out some output. The shallow ones content themselves with just one layer of computation in between input and output. The deep networks, instead, have layers of computation that pass off results to other layers of computation. If this sounds boring, or prosaic, that’s because it is. “Deep learning” has revolutionized AI, and inspired all these companies, which is awesome and all, but deep down (pun more or less intended) it’s really just a way to say that there’s more math in the thing than in the earlier things.
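For the skeptical, here is roughly how little separates shallow from deep. This is a minimal sketch in plain Python, not how any real library does it: the function names (layer, forward, is_deep), the tanh nonlinearity, and every weight are made up for illustration, and the only thing borrowed from the paragraph above is the threshold of two or more hidden layers.

```python
import math

def layer(inputs, weights, biases):
    """One layer of computation: weighted sums passed through tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Feed the input through each layer; each output is the next layer's input."""
    activations = inputs
    for weights, biases in layers:
        activations = layer(activations, weights, biases)
    return activations

def is_deep(layers):
    # The last layer produces the output; everything before it is hidden.
    # Threshold taken from the essay: two or more hidden layers.
    return len(layers) - 1 >= 2

# Hypothetical weights, chosen arbitrarily.
shallow = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1]),  # hidden layer: 2 inputs -> 2 units
    ([[1.0, -1.0]], [0.0]),                   # output layer: 2 units -> 1 output
]
deep = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1]),  # first hidden layer
    ([[0.3, 0.3], [-0.6, 0.2]], [0.0, 0.0]),  # second hidden layer
    ([[1.0, -1.0]], [0.0]),                   # output layer
]

print(is_deep(shallow), forward([1.0, 2.0], shallow))
print(is_deep(deep), forward([1.0, 2.0], deep))
```

The only difference between the two networks is one extra entry in a list, which is the whole point: more math in the thing than in the earlier things.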

Among the latest crop of AI products, the depth refers to something else entirely. ChatGPT and its rebel cousins are made of deep learning, but until recently they didn’t call themselves deep. That would have been cliché. Tired. Instead, they presented themselves as cheerful chatbots, ready to obey your merest wish and delight you with magic. I mentioned the sparkles already. For a time, AI, everywhere, was marketed as generic, almost meaningless, but clearly whimsical ✨ twinkles ✨. It didn’t matter what it did: create a slop image for your blog? pretend it’s your therapist? try really hard to not give you a method to make bombs and send them to your local politician? What mattered is that they could do it, instantly, at almost no cost. Magic, on tap.

But then the users started asking more difficult questions. Factual questions. Questions that could be true or false. The happy magical fairies in the machines had to be very careful here: it would be unseemly to give wrong answers. Fortunately, there was a solution: search the web! All the answers to any question are in there somewhere, on Wikipedia, on some blogs written by weirdoes, on Stack Overflow, on some Reddit thread or other from 2018. This worked, sometimes. But the users, already jaded by all this magic, demanded higher epistemic standards than misinterpreted Reddit threads about eating rocks. They wanted the very best information, not whatever the first Google hit was. The sparkly magical chatbot, to be maximally helpful, would have to look around the web for a while. It would have to try to find trustworthy sources, such as academic papers. It would have to make a research plan, collect different perspectives, and synthesize everything it found adroitly. It would have to think.
I’m not entirely sure who first came up with the deep research metaphor, as distinct from the deep learning one. It may have been Google Gemini, which used the term a while ago, in the prehistoric era called December 2024. But as often happens, it is OpenAI that popularized it, creating a buzz at the beginning of this month with ChatGPT Deep Research. Regardless, the phrase has caused ripples throughout the pelagic zone, catching every other whale and small fish. Some have resisted the buzz, like Anthropic’s Claude. Most of the others could not. The allure of the deep was irresistible. Imagine building a submarine that can barely dive more than a few meters below the surface when you could instead build a bathyscaphe. So now the option is there, almost everywhere. You can stay at shallow depths if you want, if you don’t want to incur the costs of a more involved expedition ($200/month for ChatGPT Pro!). But when you’re ready to descend, you can.

Or can you? How deep do these things really go? Boot up ChatGPT Deep Research or any of the others, ask it a complex question, and it will spend a few minutes scouring the web for sources. How? Unclear. OpenAI helpfully informs us that Deep Research:
“learned to plan and execute a multi-step trajectory to find the data it needs, backtracking and reacting to real-time information where necessary. The model is also able to browse over user uploaded files, plot and iterate on graphs using the python tool, embed both generated graphs and images from websites in its responses, and cite specific sentences or passages from its sources.”
That’s great. The model can go in the water and spend some time examining the fauna and flora it finds instead of returning just the first curiosity it encounters. An improvement over previous versions, to be sure, and the evaluations it was subjected to certainly show that it can answer difficult exam questions more accurately than any other AI. But none of that says much about depth.

And in fact there isn’t really any evidence that anyone is using these tools to truly dive deep. OpenAI’s Deep Research “pretends to use high-quality sources but in practice seems to just browse the front page of Google”, says one user. “If you want to survey a lot of information at once, you want Deep Research,” says another. I feel compelled to point out that “survey a lot of information at once” sounds like the very opposite of anything deep. Breadth-first search instead of depth-first. OpenAI’s Deep Research, like Perplexity’s and Google Gemini’s and Grok’s and so on, is excellent at gathering information from everywhere on the web at record speed. At Elicit, where I work, we have just released our version of this — which by the way is the best on the market — using only academic papers and a customizable, transparent process to get the most relevant ones. But we don’t call it deep, because it isn’t.

What actually is deep research? What is the web equivalent of mounting an expedition to go observe the colossal squid in its natural environment? I can imagine many answers. Deep research means falling into a Wikipedia rabbit hole, initially wanting to look up the date of the invention of writing in Sumeria and ending up reading about the debates on creating the metric system in Revolutionary France. It means opening a dozen threads from pre-2018 Reddit, realizing that crucial comments were deleted, and messaging the other users in the discussion to see if they remember anything, and receiving no reply. It means hunting down a quote in the digitized archives of a 17th-century book in which you can’t use cmd-f because the optical character recognition algorithm doesn’t know that an ‘ſ’ is the same as an ‘s’. It means coming to the realization that the answer to your question is in a scientific paper that cannot be found on Sci-Hub or on any website indexed by a search engine, and actually driving to the local university library to read it. It means downloading a huge and barely readable dataset from some NGO that takes five minutes to open in Excel. It means emailing a government employee you once met to obtain a PDF copy of some grey literature report that has never been uploaded anywhere. It means groping in the dark, getting your hopes up at the glimpse of the tiniest bioluminescent insight, and constantly hoping that you won’t be devoured by some predator hitherto unknown to mankind.

Despite our best efforts, AI cannot do this, not yet. I may be biased, but I think Elicit is well positioned to make the most headway towards the true depths in the coming year. Surely all the other companies are going to try their best, too. So perhaps AI search tools will soon be able to help us actually dive towards the true, dark depths. Until then, there’s plenty for AI to find in the shallow coral reefs and intertidal zones of the internet, and I do suggest you take advantage of it. Just don’t buy the hype: AI search is no more “deep” than it is magical. It’s useful, and that’s good enough.
I’m much more intrigued by the last image than by the rest of the article. What does it have to do with cooking? 🦑🍳
I wouldn’t do any research; I’m learning to tolerate the uncertainty and ignorance of the surface, since I can’t swim 🙈🙉🙊