News Gist .News

Articles | Politics | Finance | Stocks | Crypto | AI | Technology | Science | Gaming | PC Hardware | Laptops | Smartphones | Archive

Anthropic Used Pokémon to Benchmark Its Newest AI Model

Anthropic used Pokémon to benchmark its newest AI model, Claude 3.7 Sonnet, in a blog post published Monday. In this experiment, Anthropic's latest model demonstrated the ability to engage in "extended thinking" and successfully battled three Pokémon gym leaders. By leveraging the simple yet engaging gameplay of Pokémon Red, Anthropic was able to test the limits of its AI model.

See Also

Super Mario to Benchmark AI Performance. Δ1.82

Researchers at Hao AI Lab have used Super Mario Bros. as a benchmark for AI performance, with Anthropic's Claude 3.7 performing the best, followed by Claude 3.5. This unexpected choice highlights the limitations of traditional benchmarks in evaluating AI capabilities. The lab's approach demonstrates the need for more nuanced and realistic evaluation methods to assess AI intelligence.

AI Startup Anthropic Valued at $61.5B After Latest Funding Round. Δ1.81

Anthropic has secured a significant influx of capital, with its latest funding round valuing the company at $61.5 billion post-money. The Amazon- and Google-backed AI startup plans to use this investment to advance its next-generation AI systems, expand its compute capacity, and accelerate international expansion. Anthropic's recent announcements, including Claude 3.7 Sonnet and Claude Code, demonstrate its commitment to developing AI technologies that can augment human capabilities.

Anthropic Raises $3.5B to Fuel Its AI Ambitions Δ1.78

AI startup Anthropic has successfully raised $3.5 billion in a Series E funding round, achieving a post-money valuation of $61.5 billion, with notable participation from major investors including Lightspeed Venture Partners and Amazon. The new funding will support Anthropic's goal of advancing next-generation AI systems, enhancing compute capacity, and expanding its international presence while aiming for profitability through new tools and subscription models. Despite a robust annual revenue growth, the company faces significant operational costs, projecting a $3 billion burn rate this year.

DuckDuckGo Leans Further Into GenAI as Its AI Chat Interface Exits Beta Δ1.76

DuckDuckGo is expanding its use of generative AI in both its conventional search engine and new AI chat interface, Duck.ai. The company has been integrating AI models developed by major providers like Anthropic, OpenAI, and Meta into its product for the past year, and has now exited beta for its chat interface. Users can access these AI models through a conversational interface that generates answers to their search queries.

AI Model Evolution: Increased Size Brings Greater Capabilities but Higher Costs Δ1.76

OpenAI has begun rolling out its newest AI model, GPT-4.5, to users on its ChatGPT Plus tier, promising a more advanced experience with its increased size and capabilities. However, the new model's high costs are raising concerns about its long-term viability. The rollout comes after GPT-4.5 launched for subscribers to OpenAI’s $200-a-month ChatGPT Pro plan last week.

Google Releases SpeciesNet, an AI Model Designed to Identify Wildlife Δ1.75

Google has open-sourced an AI model, SpeciesNet, designed to identify animal species by analyzing photos from camera traps. Researchers around the world use camera traps — digital cameras connected to infrared sensors — to study wildlife populations. But while these traps can provide valuable insights, they generate massive volumes of data that take days to weeks to sift through.

AI Scholars Win Turing Prize for Technique That Made Possible AlphaGo's Chess Triumph Δ1.75

Andrew G. Barto and Richard S. Sutton have been awarded the 2025 Turing Award for their pioneering work in reinforcement learning, a key technique that has enabled significant achievements in artificial intelligence, including Google's AlphaZero. This method operates by allowing computers to learn through trial and error, forming strategies based on feedback from their actions, which has profound implications for the development of intelligent systems. Their contributions not only laid the mathematical foundations for reinforcement learning but also sparked discussions on its potential role in understanding creativity and intelligence in both machines and living beings.

DuckDuckGo Embeds AI Search Tool but Keeps Users' Options Δ1.75

DuckDuckGo's recent development of its AI-generated search tool, dubbed DuckDuckAI, marks a significant step forward for the company in enhancing user experience and providing more concise responses to queries. The AI-powered chatbot, now out of beta, will integrate web search within its conversational interface, allowing users to seamlessly switch between the two options. This move aims to provide a more flexible and personalized experience for users, while maintaining DuckDuckGo's commitment to privacy.

The AI Chatbot Showdown Reveals No Clear Winner Δ1.74

GPT-4.5 and Google's Gemini Flash 2.0, two of the latest entrants to the conversational AI market, have been put through their paces to see how they compare. While both models offer some similarities in terms of performance, GPT-4.5 emerged as the stronger performer with its ability to provide more detailed and nuanced responses. Gemini Flash 2.0, on the other hand, excelled in its translation capabilities, providing accurate translations across multiple languages.

The AI Research Gap: Comparing Deep Research with Standard ChatGPT Models Δ1.74

Deep Research on ChatGPT provides comprehensive, in-depth answers to complex questions, but often at a cost of brevity and practical applicability. While it delivers detailed mini-reports that are perfect for trivia enthusiasts or those seeking nuanced analysis, its lengthy responses may not be ideal for everyday users who need concise information. The AI model's database and search tool can resolve most day-to-day queries, making it a reliable choice for quick answers.

The Ai Chatbot App Gains Global Momentum as Deepseek Surpasses U.s. Competition Δ1.74

DeepSeek has broken into the mainstream consciousness after its chatbot app rose to the top of the Apple App Store charts (and Google Play, as well). DeepSeek's AI models, trained using compute-efficient techniques, have led Wall Street analysts — and technologists — to question whether the U.S. can maintain its lead in the AI race and whether the demand for AI chips will sustain. The company's ability to offer a general-purpose text- and image-analyzing system at a lower cost than comparable models has forced domestic competition to cut prices, making some models completely free.

Anthropic Quietly Removes Biden-Era AI Policy Commitments From Its Website Δ1.74

Anthropic has quietly removed several voluntary commitments the company made in conjunction with the Biden administration to promote safe and "trustworthy" AI from its website, according to an AI watchdog group. The deleted commitments included pledges to share information on managing AI risks across industry and government and research on AI bias and discrimination. Anthropic had already adopted some of these practices before the Biden-era commitments.

Anthropic's Claude Code Tool Had a Bug that 'Bricked' Some Systems Δ1.74

Anthropic's coding tool, Claude Code, is off to a rocky start due to the presence of buggy auto-update commands that broke some systems. When installed at certain permissions levels, these commands allowed applications to modify restricted file directories and, in extreme cases, "brick" systems by changing their access permissions. Anthropic has since removed the problematic commands and provided users with a troubleshooting guide.

AI Takes Center Stage as Alibaba Drives Shares Higher Δ1.74

Alibaba Group's release of an artificial intelligence (AI) reasoning model has driven its Hong Kong-listed shares more than 8% higher on Thursday, outperforming global hit DeepSeek's R1. The company's AI unit claims that its QwQ-32B model can achieve performance comparable to top models like OpenAI's o1 mini and DeepSeek's R1. Alibaba's new model is accessible via its chatbot service, Qwen Chat, allowing users to choose various Qwen models.

The Future of Ai Tech Advances at Breakneck Pace Δ1.74

One week in tech has seen another slew of announcements, rumors, reviews, and debate. The pace of technological progress is accelerating rapidly, with AI advancements being a major driver of innovation. As the field continues to evolve, we're seeing more natural and knowledgeable chatbots like ChatGPT, as well as significant updates to popular software like Photoshop.

Anthropic Quietly Scrubs Biden-Era Responsible AI Commitment From Its Website Δ1.74

Anthropic appears to have removed its commitment to creating safe AI from its website, alongside other big tech companies. The deleted language promised to share information and research about AI risks with the government, as part of the Biden administration's AI safety initiatives. This move follows a tonal shift in several major AI companies, taking advantage of changes under the Trump administration.

Distilling AI Models Costs Less, Raises Revenue Questions Δ1.74

Developers can access AI model capabilities at a fraction of the price thanks to distillation, allowing app developers to run AI models quickly on devices such as laptops and smartphones. The technique uses a "teacher" LLM to train smaller AI systems, with companies like OpenAI and IBM Research adopting the method to create cheaper models. However, experts note that distilled models have limitations in terms of capability.

AI Versus the Brain and the Race for General Intelligence Δ1.74

The ongoing debate about artificial general intelligence (AGI) emphasizes the stark differences between AI systems and the human brain, which serves as the only existing example of general intelligence. Current AI, while capable of impressive feats, lacks the generalizability, memory integration, and modular functionality that characterize brain operations. This raises important questions about the potential pathways to achieving AGI, as the methods employed by AI diverge significantly from those of biological intelligence.

Amazon Is Reportedly Developing Its Own AI 'Reasoning' Model Δ1.74

Amazon is reportedly venturing into the development of an AI model that emphasizes advanced reasoning capabilities, aiming to compete with existing models from OpenAI and DeepSeek. Set to launch under the Nova brand as early as June, this model seeks to combine quick responses with more complex reasoning, enhancing reliability in fields like mathematics and science. The company's ambition to create a cost-effective alternative to competitors could reshape market dynamics in the AI industry.

The Ai Bubble Bursts: How Deepseek's R1 Model Is Freeing Artificial Intelligence From the Grip of Elites Δ1.74

DeepSeek R1 has shattered the monopoly on large language models, making AI accessible to all without financial barriers. The release of this open-source model is a direct challenge to the business model of companies that rely on selling expensive AI services and tools. By democratizing access to AI capabilities, DeepSeek's R1 model threatens the lucrative industry built around artificial intelligence.

Google Tests an AI-Only Version of Its Search Engine Δ1.74

Alphabet's Google has introduced an experimental search engine that replaces traditional search results with AI-generated summaries, available to subscribers of Google One AI Premium. This new feature allows users to ask follow-up questions directly in a redesigned search interface, which aims to enhance user experience by providing more comprehensive and contextualized information. As competition intensifies with AI-driven search tools from companies like Microsoft, Google is betting heavily on integrating AI into its core business model.

AI Stocks on Wall Street's Radar Right Now: A New Generation of Ad Platforms Under Scrutiny Δ1.73

AppLovin Corporation (NASDAQ:APP) is pushing back against allegations that its AI-powered ad platform is cannibalizing revenue from advertisers, while the company's latest advancements in natural language processing and creative insights are being closely watched by investors. The recent release of OpenAI's GPT-4.5 model has also put the spotlight on the competitive landscape of AI stocks. As companies like Tencent launch their own AI models to compete with industry giants, the stakes are high for those who want to stay ahead in this rapidly evolving space.

Openai’s Largest Ai Model Ever Arrives to Mixed Reviews Δ1.73

GPT-4.5 offers marginal gains in capability but poor coding performance despite being 30 times more expensive than GPT-4o. The model's high price and limited value are likely due to OpenAI's decision to shift focus from traditional LLMs to simulated reasoning models like o3. While this move may mark the end of an era for unsupervised learning approaches, it also opens up new opportunities for innovation in AI.

The Rise of AI-Powered Search: Google's New Chatbot Mode Changes Everything Δ1.73

Google is revolutionizing its search engine with the introduction of AI Mode, an AI chatbot that responds to user queries. This new feature combines advanced AI models with Google's vast knowledge base, providing hyper-specific answers and insights about the real world. The AI Mode chatbot, powered by Gemini 2.0, generates lengthy answers to complex questions, making it a game-changer in search and information retrieval.

The Rise of Google's AI Mode in Search: A New Frontier in Information Synthesis Δ1.73

Google's AI Mode offers reasoning and follow-up responses in search, synthesizing information from multiple sources unlike traditional search. The new experimental feature uses Gemini 2.0 to provide faster, more detailed, and capable of handling trickier queries. AI Mode aims to bring better reasoning and more immediate analysis to online time, actively breaking down complex topics and comparing multiple options.