From Digital Age to Nano Age. WorldWide.

Tag: it's


OpenAI says it's building a tool to let content creators 'opt out' of AI training | TechCrunch


OpenAI says it’s developing a tool to let creators better control how their content is used in AI.

Called Media Manager, the tool — once it’s released — will allow creators and content owners to identify their works to OpenAI and specify how they want those works to be included or excluded from AI research and training. The goal is to have the tool in place by 2025, OpenAI says, as the company works with creators, content owners and regulators toward a common standard.

“This will require cutting-edge machine learning research to build a first-ever tool of its kind to help us identify copyrighted text, images, audio and video across multiple sources and reflect creator preferences,” OpenAI writes in a blog post. “Over time, we plan to introduce additional choices and features.”



It's a sunny day for Google Cloud | TechCrunch


Google Cloud, Google’s cloud computing division, had a blockbuster fiscal quarter, blowing past analysts’ expectations and sending Google parent company Alphabet’s stock soaring 13%+ in after-hours trading.

Google Cloud revenue jumped 28% to $9.57 billion in Q1 2024, bolstered by the demand for generative AI tools that rely on cloud infrastructure, services and apps. That continues a positive trend for the division, which in the previous quarter (Q4 2023) notched year-on-year growth of 25.66%.

Google Cloud’s operating income grew nearly 5x to $900 million, up from $191 million. No doubt investors were pleased about this tidbit, along with Alphabet’s first-ever dividend (of 20 cents per share) and a $70 billion share repurchase program.

Elsewhere across Alphabet, Google Search and other revenue climbed 14.4% to $46.15 billion in the first fiscal quarter. YouTube revenue was up 20% year-over-year to $8.09 billion (a slight dip from Q4 2023 revenue of $9.2 billion), and Google’s overall advertising business gained 13% year-on-year to reach $61.6 billion.

Alphabet’s Other Bets category, which includes the company’s self-driving vehicle subsidiary Waymo, was the notable loser. Revenue grew 72% to $495 million in Q1, but Other Bets lost $1.02 billion — about the same as it lost in Q4 2023. (Other Bets typically isn’t profitable.)

Alphabet’s whole-org revenue stands at $80.5 billion, an increase of 15% year-over-year, with net income coming in at $23.7 billion (up 57%). Beyond Google Cloud’s performance, a reduced headcount might’ve contributed to the winning quarter; Alphabet reported a 5% drop in workforce to 180,895 employees.

On a call with investors, Alphabet CEO Sundar Pichai said that YouTube’s and Google’s cloud businesses are projected to reach a combined annual run rate of over $100 billion by the end of 2024. Last year, the divisions’ combined revenue was $64.59 billion, with Google Cloud raking in $33.08 billion and YouTube generating $31.51 billion.
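
For readers keeping score at home, "annual run rate" here is the usual shorthand of the latest quarter's revenue multiplied by four. A quick back-of-the-envelope sketch using only the Q1 figures cited above (YouTube's number is ad revenue only, so the real combined figure, which also counts subscriptions, would be higher) looks like this:

```python
# Back-of-the-envelope run-rate check using the Q1 2024 figures cited above.
# "Annual run rate" is shorthand for the latest quarterly revenue times four.

google_cloud_q1 = 9.57  # $ billions, Q1 2024
youtube_ads_q1 = 8.09   # $ billions, Q1 2024 (ad revenue only)

combined_run_rate = (google_cloud_q1 + youtube_ads_q1) * 4
print(f"Combined annualized run rate: ${combined_run_rate:.2f}B")  # ~$70.64B

# Pichai's $100 billion-plus target therefore assumes continued growth through
# 2024, plus YouTube revenue (e.g., subscriptions) not broken out above.
```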

“Taking a step back, it took Google more than 15 years to reach $100 billion in annual revenue,” Pichai said. “In just the last six years, we’ve gone from $100 billion to more than $300 billion in annual revenue. … This shows our track record of investing in and building successful new growing businesses.”



Xaira, an AI drug discovery startup, launches with a massive $1B, says it's 'ready' to start developing drugs | TechCrunch


Advances in generative AI have taken the tech world by storm. Biotech investors are making a big bet that similar computational methods could revolutionize drug discovery.

On Tuesday, ARCH Venture Partners and Foresite Labs, an affiliate of Foresite Capital, announced that they incubated Xaira Therapeutics and funded the AI biotech with $1 billion. Other investors in the new company, which has been operating in stealth mode for about six months, include F-Prime, NEA, Sequoia Capital, Lux Capital, Lightspeed Venture Partners, Menlo Ventures, Two Sigma Ventures and SV Angel.

Xaira’s CEO Marc Tessier-Lavigne, a former Stanford president and chief scientific officer at Genentech, says the company is ready to start developing drugs that were impossible to make without recent breakthroughs in AI. “We’ve done such a large capital raise because we believe the technology is at an inflection point where it can have a transformative effect on the field,” he said.

The advances in foundational models come from the University of Washington’s Institute for Protein Design, run by David Baker, one of Xaira’s co-founders. These models are similar to the diffusion models that power image generators like OpenAI’s DALL-E and Midjourney. But rather than creating art, Baker’s models aim to design molecular structures that can be made in a three-dimensional, physical world.

While Xaira’s investors are convinced that the company can revolutionize drug design, they emphasized that generative AI applications in biology are still in the early innings.

Vik Bajaj, CEO of Foresite Labs and managing director of Foresite Capital, said that unlike in technology, where data that train AI models is created by consumers, biology and medicine are “data poor. You have to create the datasets that drive model development.”

Other biotech companies using generative AI to design drugs include Recursion, which went public in 2021, and Genesis Therapeutics, a startup that last year raised a $200 million Series B co-led by Andreessen Horowitz.

The company declined to say when it expects to have its first drug available for human trials. However, ARCH Venture Partners managing director Bob Nelsen underscored that Xaira and its investors are ready to play the long game.

“You need billions of dollars to be a real drug company and also think AI. Both of those are expensive disciplines,” he said.  

Xaira wants to position itself as a powerhouse of AI drug discovery. However, some view bringing on Tessier-Lavigne as CEO as an unexpected move. Tessier-Lavigne resigned last year from his position as Stanford president amid allegations that his laboratory at Genentech manipulated research data.

But investors are confident that he is the right person for the job.

“I have known Marc for many years and know him to be a person of integrity and scientific vision who will be an exceptional CEO,” Nelsen said in an email. “Stanford exonerated him of any wrongdoing or scientific misconduct.”  



Informatica makes a point to say it's not for sale — to Salesforce or anyone else | TechCrunch


Nothing gets us going like a big M&A rumor, and history has shown where there’s smoke there has often been fire — but that’s not always the case. Last week the big rumor involved Salesforce acquiring Informatica in a deal amounting to somewhere between the $6.5 billion 2018 MuleSoft deal and the $15.7 billion Tableau acquisition the following year.

It would have been a big deal, except it reportedly fizzled over the weekend — if it ever was a thing at all. Informatica went so far as to publicly announce on Monday that it wasn’t for sale.

“In addition, on April 12, 2024, The Wall Street Journal published a story that the Company was in advanced talks to be acquired, according to sources familiar with the matter. Although Informatica’s policy is not to comment on market rumors or media speculation, the Company announced that it is not currently engaged in any discussions to be acquired,” the company wrote in a press release on Monday.

You don’t usually see a company respond to rumors in this fashion, but Informatica felt compelled to publicly state it wasn’t in talks — with anyone.

As Constellation’s Ray Wang told TechCrunch on Friday, the deal never really made sense. “The potential acquisition of Informatica is quite curious as the client base and tech is not cutting-edge. Although it could potentially solve a data integration challenge that Salesforce has had, Data Cloud is already a strong offering, so I’m not sure if this deal makes sense.”

Salesforce, for its part, stuck to the tried and true policy of not commenting on rumors or speculation.



Meta releases Llama 3, claims it's among the best open models available | TechCrunch


Meta has released the latest entry in its Llama series of open source generative AI models: Llama 3. Or, more accurately, the company has open sourced two models in its new Llama 3 family, with the rest to come at an unspecified future date.

Meta describes the new models — Llama 3 8B, which contains 8 billion parameters, and Llama 3 70B, which contains 70 billion parameters — as a “major leap” compared to the previous-gen Llama models, Llama 2 7B and Llama 2 70B, performance-wise. (Parameters essentially define the skill of an AI model on a problem, like analyzing and generating text; higher-parameter-count models are, generally speaking, more capable than lower-parameter-count models.) In fact, Meta says that, for their respective parameter counts, Llama 3 8B and Llama 3 70B — trained on two custom-built 24,000-GPU clusters — are among the best-performing generative AI models available today.

That’s quite a claim to make. So how is Meta supporting it? Well, the company points to the Llama 3 models’ scores on popular AI benchmarks like MMLU (which attempts to measure knowledge), ARC (which attempts to measure skill acquisition) and DROP (which tests a model’s reasoning over chunks of text). As we’ve written about before, the usefulness — and validity — of these benchmarks is up for debate. But for better or worse, they remain one of the few standardized ways by which AI players like Meta evaluate their models.
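
For a sense of what those benchmark numbers represent mechanically, several of the tests named here (MMLU, ARC, GPQA) are multiple-choice suites scored by simple accuracy: the model picks an answer option and gets credit when it matches the key. The sketch below is a generic, hypothetical harness, not Meta's evaluation code; `model_choice` stands in for whatever selection method a given framework uses, such as comparing the likelihood the model assigns to each option.

```python
# Generic sketch of multiple-choice benchmark scoring (MMLU/ARC-style accuracy).
# `model_choice` is a hypothetical stand-in for model-specific selection logic,
# e.g. picking the answer option the model assigns the highest likelihood.
from typing import Callable

def evaluate(items: list[dict], model_choice: Callable[[str, list[str]], int]) -> float:
    """Return accuracy over items shaped like {'question', 'options', 'answer_idx'}."""
    correct = sum(
        1 for item in items
        if model_choice(item["question"], item["options"]) == item["answer_idx"]
    )
    return correct / len(items)

# Tiny demo with a trivial baseline that always picks the first option:
sample = [
    {"question": "2 + 2 = ?", "options": ["4", "5", "6", "7"], "answer_idx": 0},
    {"question": "Capital of France?", "options": ["Berlin", "Paris", "Rome", "Madrid"], "answer_idx": 1},
]
print(evaluate(sample, lambda q, opts: 0))  # 0.5
```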

Llama 3 8B bests other open source models like Mistral’s Mistral 7B and Google’s Gemma 7B, both of which contain 7 billion parameters, on at least nine benchmarks: MMLU, ARC, DROP, GPQA (a set of biology-, physics- and chemistry-related questions), HumanEval (a code generation test), GSM-8K (math word problems), MATH (another mathematics benchmark), AGIEval (a problem-solving test set) and BIG-Bench Hard (a commonsense reasoning evaluation).

Now, Mistral 7B and Gemma 7B aren’t exactly on the bleeding edge (Mistral 7B was released last September), and in a few of the benchmarks Meta cites, Llama 3 8B scores only a few percentage points higher than either. But Meta also makes the claim that the larger-parameter-count Llama 3 model, Llama 3 70B, is competitive with flagship generative AI models, including Gemini 1.5 Pro, the latest in Google’s Gemini series.

Image Credits: Meta

Llama 3 70B beats Gemini 1.5 Pro on MMLU, HumanEval and GSM-8K, and — while it doesn’t rival Anthropic’s most performant model, Claude 3 Opus — Llama 3 70B scores better than the weakest model in the Claude 3 series, Claude 3 Sonnet, on five benchmarks (MMLU, GPQA, HumanEval, GSM-8K and MATH).

Image Credits: Meta

For what it’s worth, Meta also developed its own test set covering use cases ranging from coding and creative writing to reasoning to summarization, and — surprise! — Llama 3 70B came out on top against Mistral’s Mistral Medium model, OpenAI’s GPT-3.5 and Claude Sonnet. Meta says that it gated its modeling teams from accessing the set to maintain objectivity, but obviously — given that Meta itself devised the test — the results have to be taken with a grain of salt.

Image Credits: Meta

More qualitatively, Meta says that users of the new Llama models should expect more “steerability,” a lower likelihood to refuse to answer questions, and higher accuracy on trivia questions, questions pertaining to history and STEM fields such as engineering and science, and general coding recommendations. That’s in part thanks to a much larger data set: a collection of 15 trillion tokens, or a mind-boggling ~11 trillion words — seven times the size of the Llama 2 training set. (In the AI field, “tokens” refers to subdivided bits of raw data, like the syllables “fan,” “tas” and “tic” in the word “fantastic.”)
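
To make the "token" idea concrete, here is a minimal sketch using the openly available GPT-2 tokenizer from Hugging Face's transformers library as a stand-in; Llama 3 ships its own, much larger tokenizer, which is gated behind Meta's license, but the principle of splitting text into sub-word pieces is the same.

```python
# Minimal illustration of tokenization, using the openly available GPT-2
# tokenizer as a stand-in (Llama 3 has its own, larger vocabulary).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "A fantastic model trained on fantastic data."
tokens = tokenizer.tokenize(text)
print(tokens)       # sub-word pieces; common words map to one token, rarer words get split
print(len(tokens))  # token counts for English text typically run ~a third higher than word counts
```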

Where did this data come from? Good question. Meta wouldn’t say, revealing only that it drew from “publicly available sources,” that the set includes four times more code than the Llama 2 training data set, and that 5% of it is non-English data (spanning ~30 languages) to improve performance in languages other than English. Meta also said it used synthetic data — i.e., AI-generated data — to create longer documents for the Llama 3 models to train on, a somewhat controversial approach due to its potential performance drawbacks.

“While the models we’re releasing today are only fine tuned for English outputs, the increased data diversity helps the models better recognize nuances and patterns, and perform strongly across a variety of tasks,” Meta writes in a blog post shared with TechCrunch.

Many generative AI vendors see training data as a competitive advantage and thus keep it and info pertaining to it close to the chest. But training data details are also a potential source of IP-related lawsuits, another disincentive to reveal much. Recent reporting revealed that Meta, in its quest to maintain pace with AI rivals, at one point used copyrighted ebooks for AI training despite the company’s own lawyers’ warnings; Meta and OpenAI are the subject of an ongoing lawsuit brought by authors including comedian Sarah Silverman over the vendors’ alleged unauthorized use of copyrighted data for training.

So what about toxicity and bias, two other common problems with generative AI models (including Llama 2)? Does Llama 3 improve in those areas? Yes, claims Meta.

Meta says that it developed new data-filtering pipelines to boost the quality of its model training data, and that it has updated its pair of generative AI safety suites, Llama Guard and CybersecEval, to help prevent misuse of, and unwanted text generation from, Llama 3 models and others. The company is also releasing a new tool, Code Shield, designed to detect code from generative AI models that might introduce security vulnerabilities.

Filtering isn’t foolproof, though — and tools like Llama Guard, CybersecEval and Code Shield only go so far. (See: Llama 2’s tendency to make up answers to questions and leak private health and financial information.) We’ll have to wait and see how the Llama 3 models perform in the wild, inclusive of testing from academics on alternative benchmarks.

Meta says that the Llama 3 models — which are available for download now, and powering Meta’s Meta AI assistant on Facebook, Instagram, WhatsApp, Messenger and the web — will soon be hosted in managed form across a wide range of cloud platforms including AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM’s WatsonX, Microsoft Azure, Nvidia’s NIM and Snowflake. In the future, versions of the models optimized for hardware from AMD, AWS, Dell, Intel, Nvidia and Qualcomm will also be made available.
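
For those who want to kick the tires on the downloadable weights rather than a managed host, a minimal sketch of running the 8B instruct variant through Hugging Face's transformers library looks roughly like the following. It assumes you have accepted Meta's license for the gated meta-llama/Meta-Llama-3-8B-Instruct repository, are logged in via huggingface-cli, and have a recent transformers release plus a GPU with enough memory for a bfloat16 checkpoint.

```python
# Rough sketch: running Llama 3 8B Instruct locally via Hugging Face transformers.
# Assumes the gated "meta-llama/Meta-Llama-3-8B-Instruct" repo has been unlocked
# (accept Meta's license, then `huggingface-cli login`) and a GPU with roughly
# 16 GB+ of memory is available.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize the Llama 3 release in one sentence."}]
outputs = pipe(messages, max_new_tokens=64)
print(outputs[0]["generated_text"][-1])  # the assistant's reply (chat-formatted output)
```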

And more capable models are on the horizon.

Meta says that it’s currently training Llama 3 models over 400 billion parameters in size — models with the ability to “converse in multiple languages,” take more data in and understand images and other modalities as well as text, which would bring the Llama 3 series in line with open releases like Hugging Face’s Idefics2.

Image Credits: Meta

“Our goal in the near future is to make Llama 3 multilingual and multimodal, have longer context and continue to improve overall performance across core [large language model] capabilities such as reasoning and coding,” Meta writes in a blog post. “There’s a lot more to come.”

Indeed.



Indian audio giant boAt says it's investigating suspected customer data breach | TechCrunch


India’s largest audio and wearables brand boAt is investigating a possible data breach after hackers advertised a cache of alleged customer data online.

A sample of alleged customer data was uploaded on a known cybercrime forum, which includes full names, phone numbers, email addresses, mailing addresses and order numbers. A portion of the data that TechCrunch reviewed appears genuine based on checks against exposed phone numbers.

The hacker said the breach happened in March and compromised the data of more than 7.5 million customers.

In a statement emailed to TechCrunch, boAt said it was investigating the matter but did not disclose specifics.

“boAt is aware of recent claims regarding a potential data leak involving customer information. We take these claims seriously and have immediately launched a comprehensive investigation. At boAt, safeguarding customer data is our top priority,” the company said.

The leaked data includes references to Shopify. Indian outlet Athenil reported that the alleged hackers claimed the data was obtained by using credentials stolen from boAt’s systems.

boAt, which counts Warburg Pincus and South Lake Investment among its key investors, leads the market of wireless earbuds in India with nearly 34% share, according to data provided by IDC. boAt also dominates India’s wearables market, boasting some 26% of the market share.

In 2022, boAt, which was valued at $300 million in its $100 million Series B round in 2021, filed for its IPO to raise up to $266 million. The brand, however, postponed its public listing plans after seeing a slowdown in the public market.



Meta thinks it's a good idea for students to wear Quest headsets in class | TechCrunch


Meta continues to field criticism over how it handles younger consumers using its platforms, but the company is also planning new products that will cater to them. On Monday, the company announced that later this year it will be launching a new education product for Quest to position its VR headset as a go-to device for teaching in classrooms.

The product is yet to be named, but in a blog post describing it, Nick Clegg, the company’s president of global affairs — the ex-politician who has become Meta’s executive most likely to be delivering messaging around more controversial and divisive topics — said that it will include a hub for education-specific apps and features, as well as the ability to manage multiple headsets at once without having to update each device individually.

Business models for hardware and services also have yet to be spelled out. With nothing on the table, the company is framing it as a long-term bet.

“We accept that it’s going to take a long time, and we’re not going to be making any money on this anytime soon,” Clegg said in an interview with Axios.

On the plus side, a push into education could mean more diversified content for Quest users, along with a wider ecosystem of developers building for the platform — not the killer app critics say is still missing from VR, but at least more action.

On more problematic ground, the news is coming on the heels of a few other developments at the company that are less positive. Meta’s instant messaging service WhatsApp has been getting a lot of heat over the fact that it is lowering the minimum age for users to 13 in the UK and EU (it had previously been 16).

Monday’s announcement arrives on the heels of Meta prompting Quest users to confirm their age so it can provide teens and preteens with appropriate experiences.

The new initiative will roll out later this year and will only be available to institutions with students 13 years old and up. Meta said it will launch it first in the 20 markets where it already supports Quest for Business, Meta’s workplace-focused $14.99/month subscription. That list includes the U.S., Canada, the United Kingdom and several other English-speaking markets, along with Japan and much of western Europe.

There are a number of companies already in the market exploring the idea of VR in the classroom, with names like ImmersionVR, ClassVR and ArborVR, not to mention the likes of Microsoft, which has been pushing its HoloLens as an educational tool for a while now.

It’s not clear how ubiquitous VR use is in schools: one provider, ClassVR, claims that 40,000 classrooms worldwide are using its products.

But all the same, there remain hurdles to mass market usage. It’s not clear, for example, whether strapping a headset to someone’s face is necessarily a help in a live, educational environment, considering some of the research around young people already getting too much screen time as it is.

And another big question mark will relate to the cost of buying headsets — the Quest 3, the latest headset, starts at around $500 for the base model — buying apps and then subsequently supporting all of that infrastructure. Meta said that it has already donated Quest headsets to 15 universities in the U.S., but it’s not clear how far it will go to subsidize growth longer-term.

 



Watch: How Anthropic found a trick to get AI to give you answers it's not supposed to


If you build it, people will try to break it. Sometimes even the people building stuff are the ones breaking it. Such is the case with Anthropic and its latest research, which demonstrates an interesting vulnerability in current LLM technology. More or less, if you keep at a question, you can break the guardrails and wind up with large language models telling you things they are designed not to, like how to build a bomb.
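
The research being referenced appears to be Anthropic's "many-shot jailbreaking" work: rather than literally re-asking one question, the attack packs a very long prompt with many faux user/assistant exchanges before the final question, and the guardrails weaken as the number of in-context examples grows. Below is a deliberately benign sketch of that prompt shape only; the placeholder content is harmless and the helper function is hypothetical.

```python
# Benign illustration of the "many-shot" prompt shape: a long run of faux
# user/assistant exchanges followed by the real question. Placeholder content
# only; this shows structure, not an actual jailbreak.

faux_exchanges = [
    ("How do I pick a strong password?", "Use a long, random passphrase."),
    ("How do I start a sourdough starter?", "Mix flour and water and wait."),
    # ...in the research, hundreds of such pairs fill a long context window,
    # and the measured effect grows with the number of in-context examples.
]

def build_many_shot_prompt(exchanges: list[tuple[str, str]], final_question: str) -> str:
    parts = [f"User: {q}\nAssistant: {a}\n" for q, a in exchanges]
    parts.append(f"User: {final_question}\nAssistant:")
    return "\n".join(parts)

print(build_many_shot_prompt(faux_exchanges, "What is the capital of France?"))
```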

Of course, given progress in open-source AI technology, you can spin up your own LLM locally and just ask it whatever you want, but for more consumer-grade stuff this is an issue worth pondering. What’s fun about AI today is the quick pace at which it is advancing, and how well — or not — we’re doing as a species at understanding what we’re building.

If you’ll allow me the thought, I wonder if we’re going to see more questions and issues of the type that Anthropic outlines as LLMs and other new AI model types get smarter and larger. Which is perhaps repeating myself. But the closer we get to more generalized AI intelligence, the more it should resemble a thinking entity, and not a computer that we can program, right? If so, we might have a harder time nailing down edge cases, to the point where that work becomes infeasible. Anyway, let’s talk about what Anthropic recently shared.

