AI IN BRIEF OpenAI is rolling out upgrades for GPT-4 that will, among other things, allow the AI model to answer queries from a user about a submitted image – and the super-lab has at least documented some safety risks involving that ability.
The aim of this new functionality is simple: a user can upload a picture file, and via ChatGPT ask the upgraded GPT-4 questions about this image, which it’ll try to answer. An OpenAI write-up describing this GPT-4V update (where the V stands for vision) disclosed the biz has been working on adding safeguards to limit the neural network’s potential to expose private data or generate inappropriate outputs when handling submitted images.
OpenAI has, for example, tried to block the model’s ability to recognize faces or exact locations from uploaded pictures as well as refrain from commenting on people’s appearances in submitted snaps, we’re told. Additional defenses include preventing the LLM from automatically solving CAPTCHAs or describing illicit behavior, and trying to reduce its tendency to generate false information.
“In some cases, it could also fail to identify information from images. It could miss text or characters, overlook mathematical symbols, and be unable to recognize spatial locations and color mappings,” the outfit warned in its paper [PDF] describing GPT-4V.
The model’s limitations mean the LLM isn’t well suited for performing some tasks, especially ones that are risky, such as identifying illegal drugs or safe-to-eat mushrooms. OpenAI also warned that GPT-4V, as usual for a GPT-4 model, has the ability to generate text and images that could be used to spread effective disinformation at a large scale.
“Previous work has shown that people are more likely to believe true and false statements when they’re presented alongside an image, and have false recall of made up headlines when they are accompanied with a photo. It is also known that engagement with content increases when it is associated with an image,” it said.
In practical terms, GPT-4V and its image-processing capabilities can be used via OpenAI’s ChatGPT by Plus users. Meanwhile, OpenAI is deploying voice input support to iOS and Android for ChatGPT Plus users. “You can now use voice to engage in a back-and-forth conversation with your assistant,” the biz said.
We earlier wrote about the mysterious French AI startup Mistral, and now the biz has released – via a Magnet link – a 7.3-billion-parameter large language model that it claims outperforms some rivals. It’s also said to be unmoderated and uncensored, so it can be used to produce questionable output as well as the usual stuff these LLMs can do from prompts. Use… as you wish, we guess.
“The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance,” the biz said. “It does not have any moderation mechanism. We’re looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs.”
Meta scales up context window for Llama 2 models
Meta has expanded the length of text users can input to its Llama 2 models to up to 32,768 tokens or chunks of words, dubbing the resulting systems Llama 2 Long.
Increasing the length of the input prompt means that the models can process more data to carry out more complex tasks, such as summarizing big reports or searching for information over longer contexts.
Bear in mind: Anthropic’s Claude model can process up to 100,000 tokens, an amount of text equivalent to 75,000 words, or hundreds of pages of prose. In a paper [PDF] put out last week, Meta claimed its top 70-billion-parameter large language model Llama 2 Long, perhaps unsurprisingly, outperforms OpenAI’s GPT-3.5-turbo model with a context window of 16,000 tokens.
Meta has been applauded by some for releasing its Llama 2 models for developers and academics to tinker with. But not everyone’s happy. Protesters stood outside the mega-corp’s office in San Francisco on Friday to raise awareness of the dangers and risks of releasing the models’ weights, which allows miscreants to use the models without any additional safeguards.
“Meta’s release policy for frontier AI models is fundamentally unsafe … Before it releases even more advanced models – which can have more dangerous capabilities in the hands of bad actors – we call on Meta to take responsible release seriously and stop irreversible proliferation,” the protest group said in a statement. The protest itself was organized on Meta’s Facebook and very lightly attended.
Amazon exec confirms Alexa may use your voice for AI training
Departing Amazon exec Dave Limp told Bloomberg TV the other day he reckons the web giant’s Alexa digital assistant will increasingly become a pay-to-play service. Crucially, he also said Alexa may use some people’s conversations with the AI system to train Amazon’s large language model Alexa LLM.
“Customers can still access the same robust set of tools and privacy controls that put them in control of their Alexa experience today,” an Amazon spokesperson told NBC News. “For example, customers will always know when Alexa is listening to their request because the blue light indicator will glow and an optional audible tone will sound.”
It’s maybe time to check and change your settings.
Lab sets up research initiative to study security in AI
The US Department of Energy’s Oak Ridge National Laboratory announced the launch of the Center for AI Security Research (CAISER) to probe adversarial attacks on machine learning systems.
Researchers will collaborate with staff from other agencies, such as the Air Force Research Laboratory’s Information Directorate and the Department of Homeland Security Science and Technology Directorate, to assess and study security vulnerabilities in AI.
CAISER is mostly concerned with adversarial attacks, and how models can be exploited. Miscreants can poison systems by feeding junk data that can force algorithms to make incorrect predictions. Prompt injection attacks, for example, can direct a large language model to generate inappropriate and offensive text.
By understanding the impacts and analyzing the risks, it’s hoped CAISER can better inform federal agencies about existing software and capabilities as they consider adopting AI.
“We are at a crossroads. AI tools and AI-based technologies are inherently vulnerable and exploitable, which can lead to unforeseen consequences,” Edmon Begoli, ORNL’s Advanced Intelligent Systems section head and CAISER founding director, said in a statement.
“We’re defining a new field of AI security research and committing to intensive research and development of mitigating strategies and solutions against emerging AI risks.”
AWS launches AI Bedrock platform
Amazon’s cloud unit AWS earlier announced its Bedrock platform, which hosts foundation models via APIs for enterprises to train and run on the cloud giant’s hardware resources, is now generally available.
Developers can now access a number of models ranging from Meta’s Llama 2 to Amazon’s Titan Embeddings, which translates text into vector mappings for AI algorithms to process; the text-generating Amazon Titan Express and Amazon Titan Lite; and Amazon CodeWhisperer. AWS also hosts models built by other companies, such as AI21 Labs, Anthropic, Cohere, and Stability AI.
“With powerful, new innovations, AWS is bringing greater security, choice, and performance to customers, while also helping them to tightly align their data strategy across their organization, so they can make the most of the transformative potential of generative AI,” said Swami Sivasubramanian, vice president of data and AI at AWS.
AWS said enterprises from a range of industries are using Bedrock’s generative AI services, including sports clothing brand Adidas, automobile manufacturer BMW Group, LexisNexis Legal & Professional, and the US nonprofit golf tournament organization PGA Tour. ®
Source: OpenAI warns users over GPT-4 Vision’s limits and flaws • The Register