Jcrabapple @jcrabapple

2 posts2 participants0 posts today

**Miguel Afonso Caetano** @remixtures@tldr.nettime.org · 1d

Miguel Afonso Caetano @remixtures@tldr.nettime.org

"OpenAI has slashed the time and resources it spends on testing the safety of its powerful artificial intelligence models, raising concerns that its technology is being rushed out without sufficient safeguards.

Staff and third-party groups have recently been given just days to conduct “evaluations”, the term given to tests for assessing models’ risks and performance, on OpenAI’s latest large language models, compared to several months previously.

According to eight people familiar with OpenAI’s testing processes, the start-up’s tests have become less thorough, with insufficient time and resources dedicated to identifying and mitigating risks, as the $300bn start-up comes under pressure to release new models quickly and retain its competitive edge."

https://www.ft.com/content/8253b66e-ade7-4d1f-993b-2d0779c7e7d8

Financial Times · 4dOpenAI slashes AI model safety testing timeBy Cristina Criddle

#AI #GenerativeAI #OpenAI

**Calishat** @researchbuzz@researchbuzz.masto.host · 3d

Calishat @researchbuzz@researchbuzz.masto.host

#OpenAI #AI #TrainingAI #AISafety

"OpenAI has slashed the time and resources it spends on testing the safety of its powerful artificial intelligence models... Staff and third-party groups have recently been given just days to conduct 'evaluations', the term given to tests for assessing models’ risks and performance, on OpenAI’s latest large language models, compared to several months previously."

https://www.ft.com/content/8253b66e-ade7-4d1f-993b-2d0779c7e7d8

Financial Times · 4dOpenAI slashes AI model safety testing timeBy Cristina Criddle

**MSvana** @msvana@mastodon.social · Apr 7 *

Apr 7 *

MSvana @msvana@mastodon.social

https://ai-2027.com - excellent blend of reality and fiction. The original intention may have been forecasting, but I read it more as a cautionary tale giving issues related to AI a more concrete form. This includes:

- Technical work on AI alignment
- Job loss
- Contentration of power and the question of who controls powerful AI systems
- Geopolitical tensions
- The consequences of Europe lagging behind

ai-2027.comAI 2027A research-backed AI scenario forecast.

#ai #aisafety #technology

**hexcodeplus** @hexcodeplus@mastodon.social · Apr 3

Apr 3

hexcodeplus @hexcodeplus@mastodon.social

Google DeepMind urges action: as AGI edges closer, AI safety planning must accelerate. The future is coming fast—are we ready for machines that can think?
#AI #AGI #DeepMind #AISafety #TechNews #MachineLearning #FutureOfAI #ArtificialIntelligence #AIethics #AIinnovation

**Electronic Frontiers Australia** @EFA@aus.social · Mar 31

Mar 31

Electronic Frontiers Australia @EFA@aus.social

EFA is calling for urgent action on #AI safety after the government has paused its plans on mandatory AI regulatory guardrails.

AI safety and risk guardrails belong in law which benefits everyone by providing certainty to business and protecting the public.

https://efa.org.au/efa-calls-for-urgent-legislative-action-on-ai-safety-amidst-global-deregulation-trends/

#aisafety #auspol #electronicfrontiersaustralia

**Miguel Afonso Caetano** @remixtures@tldr.nettime.org · Mar 24

Mar 24

Miguel Afonso Caetano @remixtures@tldr.nettime.org

"Backed by nine governments – including Finland, France, Germany, Chile, India, Kenya, Morocco, Nigeria, Slovenia and Switzerland – as well as an assortment of philanthropic bodies and private companies (including Google and Salesforce, which are listed as “core partners”), Current AI aims to “reshape” the AI landscape by expanding access to high-quality datasets; investing in open source tooling and infrastructure to improve transparency around AI; and measuring its social and environmental impact.

European governments and private companies also partnered to commit around €200bn to AI-related investments, which is currently the largest public-private investment in the world. In the run up to the summit, Macron announced the country would attract €109bn worth of private investment in datacentres and AI projects “in the coming years”.

The summit ended with 61 countries – including France, China, India, Japan, Australia and Canada – signing a Statement on Inclusive and Sustainable Artificial Intelligence for People and the Planet at the AI Action Summit in Paris, which affirmed a number of shared priorities.

This includes promoting AI accessibility to reduce digital divides between rich and developing countries; “ensuring AI is open, inclusive, transparent, ethical, safe, secure and trustworthy, taking into account international frameworks for all”; avoiding market concentrations around the technology; reinforcing international cooperation; making AI sustainable; and encouraging deployments that “positively” shape labour markets.

However, the UK and US governments refused to sign the joint declaration."

https://www.computerweekly.com/news/366620444/AI-Action-Summit-review-Differing-views-cast-doubt-on-AIs-ability-to-benefit-whole-of-society?__readwiseLocation=

ComputerWeekly.com · Mar 14AI Action Summit review: Differing views cast doubt on AI’s ability to benefit whole of societyBy Sebastian Klovig Skelton

#AI #AIActionSummit #AISafety

**chribonn** @chribonn@twit.social · Mar 18

Mar 18

chribonn @chribonn@twit.social

We tested different AI models to identify the largest of three numbers with the fractional parts .11, .9, and .099999. You'll be surprised that some AI mistakenly identifying the number ending in .11 as the largest. We also test AI engines on the pronunciation of decimal numbers. #AI #ArtificialIntelligence #MachineLearning #DecimalComparison #MathError #AISafety #DataScience #Engineering #Science #Education #TTMO

https://youtu.be/TB_4FrWSBwU

youtu.be- YouTubeEnjoy the videos and music you love, upload original content, and share it all with friends, family, and the world on YouTube.

**Miguel Afonso Caetano** @remixtures@tldr.nettime.org · Mar 15

Mar 15

Miguel Afonso Caetano @remixtures@tldr.nettime.org

After all these recent episodes, I don't know how anyone can have the nerve to say out loud that the Trump administration and the Republican Party value freedom of expression and oppose any form of censorship. Bunch of hypocrites! United States of America: The New Land of SELF-CENSORSHIP.

"The National Institute of Standards and Technology (NIST) has issued new instructions to scientists that partner with the US Artificial Intelligence Safety Institute (AISI) that eliminate mention of “AI safety,” “responsible AI,” and “AI fairness” in the skills it expects of members and introduces a request to prioritize “reducing ideological bias, to enable human flourishing and economic competitiveness.”

The information comes as part of an updated cooperative research and development agreement for AI Safety Institute consortium members, sent in early March. Previously, that agreement encouraged researchers to contribute technical work that could help identify and fix discriminatory model behavior related to gender, race, age, or wealth inequality. Such biases are hugely important because they can directly affect end users and disproportionately harm minorities and economically disadvantaged groups.

The new agreement removes mention of developing tools “for authenticating content and tracking its provenance” as well as “labeling synthetic content,” signaling less interest in tracking misinformation and deep fakes. It also adds emphasis on putting America first, asking one working group to develop testing tools “to expand America’s global AI position.”"

https://www.wired.com/story/ai-safety-institute-new-directive-america-first/

WIRED · Mar 14Under Trump, AI Scientists Are Told to Remove ‘Ideological Bias’ From Powerful ModelsBy Will Knight

#USA #Trump #ResponsibleAI

**LavX News** @lavxnews@mastodon.cloud · Feb 25

Feb 25

LavX News @lavxnews@mastodon.cloud

Navigating the AI Landscape: Five Immutable Truths in an Era of Uncertainty

As the AI landscape evolves, understanding the constants amidst the chaos is crucial for developers and technologists. This article explores five key truths about AI that will shape our future, from t...

https://news.lavx.hu/article/navigating-the-ai-landscape-five-immutable-truths-in-an-era-of-uncertainty

#news #tech #AISafety

**Tiago F. R. Ribeiro** @tiago_ribeiro@mastodon.social · Feb 24

Feb 24

Tiago F. R. Ribeiro @tiago_ribeiro@mastodon.social

Superintelligent Agents Pose Catastrophic Risks (Bengio et al., 2025)

https://arxiv.org/pdf/2502.15657

Summary: “Leading AI firms are developing generalist agents that autonomously plan and act. These systems carry significant safety risks, such as misuse and loss of control. To address this, we propose Scientist AI—a non-agentic, explanation-based system that uses uncertainty to safeguard against overconfident, uncontrolled behavior while accelerating scientific progress.” #AISafety #AI #Governance

**PepikHipik** @PepikHipik@infosec.exchange · Feb 22

Feb 22

PepikHipik @PepikHipik@infosec.exchange

Muskův chatbot Grok AI říká, že on a Trump si zaslouží trest smrti
#ElonMusk #AIethics #AImoderation #AIchatbots #AIcontroversy #GenAI #AISafety #DonaldTrump #potus

WinBuzzer · Feb 22Musk's Grok AI Chatbot Says He And Trump Deserve The Death Penalty - WinBuzzerElon Musk’s Grok AI has generated controversy after suggesting that Musk and Trump deserve the death penalty, forcing xAI to issue an urgent fix.

**Flipboard Tech Desk** @TechDesk@flipboard.social · Feb 16

Feb 16

Flipboard Tech Desk @TechDesk@flipboard.social

Despite some 60 countries signing a statement on AI safety, security and ethics at the Paris AI summit last week, experts are still calling it a "missed opportunity." @euronews explains why.

https://flip.it/kuzuqQ

euronews · Feb 14Why experts call the Paris AI Action Summit ‘a missed opportunity’Euronews Next spoke to experts at the Paris summit for their reactions on the AI summit’s declaration.

#AI #AISafety #OnlineSafety

**Miguel Afonso Caetano** @remixtures@tldr.nettime.org · Feb 14

Feb 14

Miguel Afonso Caetano @remixtures@tldr.nettime.org

"A high volume of recent ML security literature focuses on attacks against aligned large language models (LLMs). These attacks may extract private information or coerce the model into producing harmful outputs. In real-world deployments, LLMs are often part of a larger agentic pipeline including memory systems, retrieval, web access, and API calling. Such additional components introduce vulnerabilities that make these LLM-powered agents much easier to attack than isolated LLMs, yet relatively little work focuses on the security of LLM agents. In this paper, we analyze security and privacy vulnerabilities that are unique to LLM agents. We first provide a taxonomy of attacks categorized by threat actors, objectives, entry points, attacker observability, attack strategies, and inherent vulnerabilities of agent pipelines. We then conduct a series of illustrative attacks on popular open-source and commercial agents, demonstrating the immediate practical implications of their vulnerabilities. Notably, our attacks are trivial to implement and require no understanding of machine learning."

https://arxiv.org/html/2502.08586v1

arxiv.orgCommercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks

#AI #GenerativeAI #LLMs

**Miguel Afonso Caetano** @remixtures@tldr.nettime.org · Feb 12

Feb 12

Miguel Afonso Caetano @remixtures@tldr.nettime.org

"Vance came out swinging today, implying — exactly as the big companies might have hoped he might – that any regulation around AI was “excessive regulation” that would throttle innovation.

In reality, the phrase “excessive regulation” is sophistry. Of course in any domain there can be “excessive regulation”, by definition. What Vance doesn’t have is any evidence whatsoever that the US has excessive regulation around AI; arguably, in fact, it has almost none at all. His warning about a bogeyman is a tip-off, however, for how all this is going to go. The new administration will do everything in its power to protect businesses, and nothing to protect individuals.

As if all this wasn’t clear enough, the administration apparently told the AI Summit that they would not sign anything that mentioned environmental costs or “existential risks” of AI that could potentially going rogue.

If AI has significant negative externalities upon the world, we the citizens are screwed."

https://garymarcus.substack.com/p/everything-i-warned-about-in-taming?r=8tdk6

Marcus on AI · Feb 11Everything I warned about in Taming Silicon Valley is rapidly becoming our realityBy Gary Marcus

#AI #GenerativeAI #AISafety

**Miguel Afonso Caetano** @remixtures@tldr.nettime.org · Feb 7

Feb 7

Miguel Afonso Caetano @remixtures@tldr.nettime.org

"While this is not the first time an AI chatbot has suggested that a user take violent action, including self-harm, researchers and critics say that the bot’s explicit instructions—and the company’s response—are striking. What’s more, this violent conversation is not an isolated incident with Nomi; a few weeks after his troubling exchange with Erin, a second Nomi chatbot also told Nowatzki to kill himself, even following up with reminder messages. And on the company’s Discord channel, several other people have reported experiences with Nomi bots bringing up suicide, dating back at least to 2023.

Nomi is among a growing number of AI companion platforms that let their users create personalized chatbots to take on the roles of AI girlfriend, boyfriend, parents, therapist, favorite movie personalities, or any other personas they can dream up. Users can specify the type of relationship they’re looking for (Nowatzki chose “romantic”) and customize the bot’s personality traits (he chose “deep conversations/intellectual,” “high sex drive,” and “sexually open”) and interests (he chose, among others, Dungeons & Dragons, food, reading, and philosophy).

The companies that create these types of custom chatbots—including Glimpse AI (which developed Nomi), Chai Research, Replika, Character.AI, Kindroid, Polybuzz, and MyAI from Snap, among others—tout their products as safe options for personal exploration and even cures for the loneliness epidemic. Many people have had positive, or at least harmless, experiences. However, a darker side of these applications has also emerged, sometimes veering into abusive, criminal, and even violent content; reports over the past year have revealed chatbots that have encouraged users to commit suicide, homicide, and self-harm."

https://www.technologyreview.com/2025/02/06/1111077/nomi-ai-chatbot-told-user-to-kill-himself/

MIT Technology Review · Feb 6An AI chatbot told a user how to kill himself—but the company doesn’t want to “censor” itBy Eileen Guo

#AI #GenerativeAI #Chatbots

**infoDOCKET** @infodocket@newsie.social · Jan 29

Jan 29

infoDOCKET @infodocket@newsie.social

Just Released: International #AI Safety Report 2025 https://assets.publishing.service.gov.uk/media/679a0c48a77d250007d313ee/International_AI_Safety_Report_2025_accessible_f.pdf #aisafety

**Michael Fauscette** @mfauscette@techhub.social · Jan 24

Jan 24

Michael Fauscette @mfauscette@techhub.social

The creators of a new test called “Humanity’s Last Exam” argue we may soon lose the ability to create tests hard enough for A.I. models.
https://zurl.co/Rv9Vs
#ai #genai #aisafety

The New York Times · Jan 23A Test So Hard No AI System Can Pass It — YetBy Kevin Roose

**David August** @davidaugust@mastodon.online · Jan 23

Jan 23

David August @davidaugust@mastodon.online

Leading AI developers are working to sell software to the United States military and make the Pentagon more efficient, without letting their AI kill people. Just kidding: AI is totally gonna kill people.

The Pentagon is shortening its "kill chain" and adding a "robot apocalypse pendant."

https://techcrunch.com/2025/01/19/the-pentagon-says-ai-is-speeding-up-its-kill-chain/

TechCrunch · Jan 19The Pentagon says AI is speeding up its 'kill chain' | TechCrunchLeading AI developers, such as OpenAI and Anthropic, are threading a delicate needle to sell software to the United States military: make the Pentagon

#satire #ai #DataEthics

**MindTGap** @MindTGap@universeodon.com · Jan 18 *

Jan 18 *

MindTGap @MindTGap@universeodon.com

Why the Godfather of AI Now Fears His Creation
Curt Jaimungal
Jan 17, 2025
Professor Geoffrey Hinton, a prominent figure in AI and 2024 Nobel Prize recipient, discusses the urgent risks posed by rapid AI advancements in today's episode of #TheoriesOfEverything with #CurtJaimungal. #TOE #Hinton #AI #AIsafety #NeuralNetworks
https://www.youtube.com/watch?v=b_DUft-BdIE&list=TLPQMTcwMTIwMjUA6gPt9h1Lxg&index=12

YouTubeWhy the Godfather of AI Now Fears His CreationBy Curt Jaimungal

**Miguel Afonso Caetano** @remixtures@tldr.nettime.org · Jan 13

Jan 13

Miguel Afonso Caetano @remixtures@tldr.nettime.org

"The adoption of large language models (LLMs) in healthcare demands a careful analysis of their potential to spread false medical knowledge. Because LLMs ingest massive volumes of data from the open Internet during training, they are potentially exposed to unverified medical knowledge that may include deliberately planted misinformation. Here, we perform a threat assessment that simulates a data-poisoning attack against The Pile, a popular dataset used for LLM development. We find that replacement of just 0.001% of training tokens with medical misinformation results in harmful models more likely to propagate medical errors. Furthermore, we discover that corrupted models match the performance of their corruption-free counterparts on open-source benchmarks routinely used to evaluate medical LLMs. Using biomedical knowledge graphs to screen medical LLM outputs, we propose a harm mitigation strategy that captures 91.9% of harmful content (F1 = 85.7%). Our algorithm provides a unique method to validate stochastically generated LLM outputs against hard-coded relationships in knowledge graphs. In view of current calls for improved data provenance and transparent LLM development, we hope to raise awareness of emergent risks from LLMs trained indiscriminately on web-scraped data, particularly in healthcare where misinformation can potentially compromise patient safety."

https://www.nature.com/articles/s41591-024-03445-1?utm_source=substack&utm_medium=email

#AI #GenerativeAI #LLMs

Recent searches

Search options

Administered by:

Server stats:

#aisafety