

Hugging Face Clones OpenAI's Deep Research In 24 Hours
Published on February 07, 2025 at 02:38AM
An anonymous reader quotes a report from Ars Technica: On Tuesday, Hugging Face researchers released an open source AI research agent called "Open Deep Research," created by an in-house team as a challenge 24 hours after the launch of OpenAI's Deep Research feature, which can autonomously browse the web and create research reports. The project seeks to match Deep Research's performance while making the technology freely available to developers. "While powerful LLMs are now freely available in open-source, OpenAI didn't disclose much about the agentic framework underlying Deep Research," writes Hugging Face on its announcement page. "So we decided to embark on a 24-hour mission to reproduce their results and open-source the needed framework along the way!"

Similar to both OpenAI's Deep Research and Google's implementation of its own "Deep Research" using Gemini (first introduced in December -- before OpenAI), Hugging Face's solution adds an "agent" framework to an existing AI model, allowing it to perform multi-step tasks such as collecting information and building up a report as it goes, which it then presents to the user at the end.

The open source clone is already racking up comparable benchmark results. After only a day's work, Hugging Face's Open Deep Research has reached 55.15 percent accuracy on the General AI Assistants (GAIA) benchmark, which tests an AI model's ability to gather and synthesize information from multiple sources. OpenAI's Deep Research scored 67.36 percent accuracy on the same benchmark with a single-pass response (OpenAI's score rose to 72.57 percent when 64 responses were combined using a consensus mechanism).

As Hugging Face points out in its post, GAIA includes complex multi-step questions such as this one: "Which of the fruits shown in the 2008 painting 'Embroidery from Uzbekistan' were served as part of the October 1949 breakfast menu for the ocean liner that was later used as a floating prop for the film 'The Last Voyage'? Give the items as a comma-separated list, ordering them in clockwise order based on their arrangement in the painting starting from the 12 o'clock position. Use the plural form of each fruit." To answer that type of question correctly, the AI agent must seek out multiple disparate sources and assemble them into a coherent answer. Many of the questions in GAIA represent no easy task, even for a human, so they test agentic AI's mettle quite well.

Open Deep Research "builds on OpenAI's large language models (such as GPT-4o) or simulated reasoning models (such as o1 and o3-mini) through an API," notes Ars. "But it can also be adapted to open-weights AI models. The novel part here is the agentic structure that holds it all together and allows an AI language model to autonomously complete a research task." The code has been made public on GitHub.
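The agent framework Hugging Face open-sourced alongside this project is its smolagents library, and the sketch below illustrates the general pattern the article describes: wrap a language model and a web-search tool in an agent that iterates toward an answer. It is a minimal example based on smolagents' documented quickstart, not the actual Open Deep Research pipeline on GitHub; the default model and the sample question are illustrative assumptions.

```python
# Minimal sketch of the agent-plus-tools pattern described above, using
# Hugging Face's open-source smolagents library (pip install smolagents).
# The defaults and the example question are assumptions for illustration,
# not the exact Open Deep Research configuration published on GitHub.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# The agent wraps a language model and a set of tools; at each step it writes
# and executes code that can call those tools (here, a web search) until it
# decides it has gathered enough information to return a final answer.
agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # lets the model issue web searches
    model=HfApiModel(),              # hosted open-weights model via the HF Inference API
)

answer = agent.run(
    "What does the GAIA benchmark measure, and which kinds of questions "
    "does it contain? Cite your sources."
)
print(answer)
```

Swapping in an OpenAI model via an API-backed model class, as the article notes the project does, changes only the `model=` argument; the agentic loop around it stays the same.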

Read more of this story at Slashdot.

