Tag: model


OpenAI's newest model is GPT-4o | TechCrunch


OpenAI is releasing a new flagship generative AI model called GPT-4o, set to roll out “iteratively” across the company’s developer and consumer-facing products over the next few weeks. (The “o” in GPT-4o stands for “omni.”) OpenAI CTO Mira Murati said that GPT-4o provides “GPT-4-level” intelligence but improves on GPT-4’s capabilities across text and vision as […]




U.K. agency releases tools to test AI model safety | TechCrunch


The U.K. AI Safety Institute, the country’s recently established AI safety body, has released a toolset designed to “strengthen AI safety” by making it easier for industry, research organizations and academia to develop AI evaluations. Called Inspect, the toolset — which is available under an open source license, specifically an MIT License — aims to assess certain […]




Acura’s new all-electric SUV proves the most expensive model isn’t always the best | TechCrunch


The first electric vehicle I ever drove was a Tesla Roadster in 2011. I will never forget the feeling of the instant torque provided by the electric motor, propelling me to 60 miles per hour in less than four seconds — a feat my little Miata daily driver couldn’t even dream of doing.

I’ve had a bit of a love affair with that kind of acceleration ever since. But like all relationships, time has left me wanting a bit more: I want an EV that brings me as much joy in the twisties as it does on the highway on-ramps.

It was with great anticipation that I slid behind the wheel of the 2025 Acura ZDX Type S. Sure, it’s a mid-size SUV, but it wears the Type S moniker, a name reserved only for the most fun-to-drive in the Acura stable. Could this be the EV unicorn I have been looking for?

Y’all, it did not go as expected.

Nuts and bolts

Image Credits: Emme Hall

On launch, the ZDX will be available in A Spec and Type S trim — both of which come equipped with a 102 kWh battery. The A Spec will be available in rear-wheel drive with 313 miles of range and just over 350 horsepower, while all-wheel drive drops the range to 304 miles but ups the power to 409 ponies.
The performance-oriented Type S gets power down to all four wheels and goes for broke with 499 horsepower and a whopping 544 pound-feet of torque. However, all those fast-moving electrons take a toll on range, as the Type S can only go 278 miles on a full charge.

Although I didn’t get the chance to test the 190 kW charging capabilities of the ZDX, Acura says that it’s quick enough to add up to 81 miles in 10 minutes of charging and to go from 20% to 80% battery capacity in 42 minutes. However, offerings from Kia, Hyundai and Genesis can do it quicker.

When it comes to charging at home, the ZDX sports an 11.5 kW onboard charger that Acura says can add nearly 30 miles in an hour, assuming a 60 amp wall charger.
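
Those claims roughly check out against the spec sheet. Here is a quick back-of-the-envelope sanity check using the figures above (my own arithmetic, which ignores charging losses and the taper at higher states of charge, so treat it as an approximation):

```python
# Rough charging math for the ZDX Type S, using the figures quoted in this review.
# Assumes a constant charge rate and no losses, so real-world numbers land a bit lower.

battery_kwh = 102                        # pack size quoted by Acura
range_miles = 278                        # Type S range on a full charge
efficiency = range_miles / battery_kwh   # ~2.7 miles per kWh

# DC fast charging: 190 kW sustained for 10 minutes
dc_energy_kwh = 190 * (10 / 60)          # ~31.7 kWh added
print(f"DC fast charge, 10 min: ~{dc_energy_kwh * efficiency:.0f} miles")   # ~86 (Acura claims up to 81)

# Home Level 2 charging: 11.5 kW onboard charger for one hour
ac_energy_kwh = 11.5 * 1.0               # 11.5 kWh added
print(f"Level 2 charge, 1 hour: ~{ac_energy_kwh * efficiency:.0f} miles")   # ~31 (Acura says nearly 30)
```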

The S stands for Sport, right?

Image Credits: Acura

Acura says the driver experience comes first in this new car, and that goes double for the enthusiast Type S. Unfortunately, the top trim doesn’t put a smile on my face.

Slip the car into sport mode and it hunkers down 15 millimeters (just over half an inch to us Yanks), while the brake and throttle get a bit more responsive and the already heavy steering gets a bit more weighty. The adaptive dampers firm up and the car produces a subtle but noticeable performance sound.

Combined with the 544 pound-feet of torque, this should make for a supremely fun car to drive, yet somehow, it just doesn’t.

The ZDX is a blast to launch on the freeway. Similarly, accelerating at higher speeds is also satisfying, and whipping around a Prius doing 55 in the fast lane is an easy job.

Still, I expected more joy from an Acura Type S vehicle.

Don’t misunderstand me: there is nothing inherently bad about the driving experience. And yet, slinging the SUV through the back roads of Santa Barbara, California, felt clinical. Here’s where it went wrong.

The Type S weighs over 6,000 pounds. Even if the weight is evenly distributed front to rear, that’s a lot of heft to get around a turn. I like the hefty steering, but there isn’t much feedback happening. The torque is always there on corner exit and body roll is kept in check, yet I’m not feeling the delight.

The 275/40 Continental Premium Contact 6 summer tires on the Type S offer up plenty of grip, but the low-profile sidewall combined with the harder run-flat rubber compound means that the ride is just a touch harsh.

Of course, Acura knows how to build a proper Type S car. The new Integra Type S is a veritable riot to drive. I just wish the company brought the same engineering to this much larger and heavier sibling.

Braking in the ZDX is confident with big ol’ Brembo brakes up front and three levels of regeneration. You can turn regen all the way off, but why would you give up free electrons? It might take a bit of time to get used to the maximum regen, but it allows for full one-pedal driving, bringing the ZDX to a complete stop. Even if you’re not in maximum mode, you can still bring in more regen by pulling the steering wheel paddle on the left.

There is also a snow mode that raises the suspension almost a full inch, as well as a choose-your-own-adventure individual mode, but most folks will likely just keep the car in normal mode and again, that’s just fine.

Acura will go it alone

The all-electric ZDX isn’t entirely a Honda Motor vehicle. It was developed in partnership with General Motors, using the American company’s battery technology. Originally the plan was to develop a series of affordable EVs, but late last year that plan was nixed as demand for EVs slowed. However, Acura wants 100% of all products to be zero emissions by 2040 and it has set a target of net zero emissions for all products and corporate activities by 2050.

Designed through virtual and augmented reality in both the United States and Japan, the ZDX makes it clear that the creatives at Acura took the Precision EV concept we saw in 2022 at Monterey Car Week and called it good.

And mostly, they were right.

The car is nearly the same length overall as the mid-size MDX SUV, but the wheelbase is a full eight inches longer, pushing the wheels out to each corner for a somewhat aggressive stance. It sits lower than the MDX as well, giving the ZDX a bit of an “is-it-a-wagon-or-is-it-an-SUV” profile, especially with its squared off rear roofline. The rear end gives off some serious hearse design vibes, which depending on your aesthetic could be a good, great or bad exterior styling choice.

What Acura got right

Image Credits: Emme Hall

Acura has proven to be a master of color choices; the Tiger Eye Pearl and Double Apex Blue Pearl are a welcome sight on a mid-size crossover. Acura even offers a red interior on any Type S with a normcore black, white or gray exterior color.

Inside, the center console of the ZDX definitely divides the cabin into driving space and riding space. I dig it. There is plenty of small item storage here and the console also has a basement level for larger items like laptops and purses.

All trims get power-adjustable leather seats that are heated and cooled and a heated steering wheel. The Type S also adds heated rear seats, tri-zone climate control, a digital rearview mirror and a head-up display.

Overall the ZDX is comfy with clean design lines and plenty of passenger and cargo space. Sure, there are a few buttons and dials from the GM parts bin, but the design is very Acura. The rear seat is especially spacious, with more legroom than the competition from Germany and Korea. Behind the rear seats is 28.7 cubic feet of space, including 5 cubes of underfloor storage, expanding to 62 cubic feet when the rear seats are folded.

Image Credits: Emme Hall

Anyone who has driven a GM product lately will immediately recognize the 11.3-inch infotainment interface. Google is built-in here and I think it’s a more user-friendly system than anything currently on offer from Acura so I ain’t even mad. What’s more, the Google-based navigation can be sent to the 11-inch digital gauge cluster and will optimize route planning for recharging. It can even initiate battery preconditioning. Wireless Apple CarPlay and Android Auto are here as well.

All trims of the ZDX get the Acura Watch suite of ADAS features that includes things like blind-spot monitoring, automatic emergency braking and the like. The Type S adds a few features, including the Hands Free Cruise system, essentially GM’s excellent Super Cruise technology. During my test drive I had one disengagement, when the lane markings disappeared on some newly laid pavement. This is why drivers must always be paying attention, even with a hands-off system.

Like Super Cruise, the Acura Hands Free Cruise can be set to make automatic lane changes, leaving the computer to decide if it’s safe to pass a slower-moving vehicle. The car performs the task well, safely moving one lane to the left in moderate traffic; it just surprises the hell out of me.

All Acura ZDX vehicles will be ordered online, either at home or at the dealership, so you can still get some guidance should you need it. Further, Acura gives buyers a few charging perks with their new electric SUV. Options include a Level 2 charger, a $500 credit toward installation and a $100 public charging credit; or a portable charger, a $250 home charger installation credit and a $300 public charging credit. For those who can’t charge at home, Acura also offers $750 worth of public charging.

While the original intent of the GM/Honda partnership was to eventually build an inexpensive EV, the 2025 Acura ZDX definitely ain’t it. Sure, it qualifies for a $7,500 tax credit, but my top ZDX Type S tester is $74,850 including destination charges, a tough pill to swallow when the fun factor just isn’t there.

Perhaps the EV road worth traveling is behind the wheel of the less expensive A Spec.



Snowflake releases a flagship generative AI model of its own | TechCrunch


All-around, highly generalizable generative AI models were the name of the game once, and they arguably still are. But increasingly, as cloud vendors large and small join the generative AI fray, we’re seeing a new crop of models focused on the deepest-pocketed potential customers: the enterprise.

Case in point: Snowflake, the cloud computing company, today unveiled Arctic LLM, a generative AI model that’s described as “enterprise-grade.” Available under an Apache 2.0 license, Arctic LLM is optimized for “enterprise workloads,” including generating database code, Snowflake says, and is free for research and commercial use.

“I think this is going to be the foundation that’s going to let us — Snowflake — and our customers build enterprise-grade products and actually begin to realize the promise and value of AI,” CEO Sridhar Ramaswamy said in a press briefing. “You should think of this very much as our first, but big, step in the world of generative AI, with lots more to come.”

An enterprise model

My colleague Devin Coldewey recently wrote about how there’s no end in sight to the onslaught of generative AI models. I recommend you read his piece, but the gist is: Models are an easy way for vendors to drum up excitement for their R&D and they also serve as a funnel to their product ecosystems (e.g., model hosting, fine-tuning and so on).

Arctic LLM is no different. Snowflake’s flagship model in a family of generative AI models called Arctic, Arctic LLM — which took around three months, 1,000 GPUs and $2 million to train — arrives on the heels of Databricks’ DBRX, a generative AI model also marketed as optimized for the enterprise space.

Snowflake draws a direct comparison between Arctic LLM and DBRX in its press materials, saying Arctic LLM outperforms DBRX on the two tasks of coding (Snowflake didn’t specify which programming languages) and SQL generation. The company said Arctic LLM is also better at those tasks than Meta’s Llama 2 70B (but not the more recent Llama 3 70B) and Mistral’s Mixtral-8x7B.

Snowflake also claims that Arctic LLM achieves “leading performance” on a popular general language understanding benchmark, MMLU. I’ll note, though, that while MMLU purports to evaluate generative models’ ability to reason through logic problems, it includes tests that can be solved through rote memorization, so take that bullet point with a grain of salt.

“Arctic LLM addresses specific needs within the enterprise sector,” Baris Gultekin, head of AI at Snowflake, told TechCrunch in an interview, “diverging from generic AI applications like composing poetry to focus on enterprise-oriented challenges, such as developing SQL co-pilots and high-quality chatbots.”

Arctic LLM, like DBRX and Google’s top-performing generative model of the moment, Gemini 1.5 Pro, uses a mixture of experts (MoE) architecture. MoE architectures basically break down data processing tasks into subtasks and then delegate them to smaller, specialized “expert” models. So, while Arctic LLM contains 480 billion parameters, it only activates 17 billion at a time — enough to drive the 128 separate expert models. (Parameters essentially define the skill of an AI model on a problem, like analyzing and generating text.)
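
To make the routing idea concrete, here is a minimal, self-contained sketch of top-k expert routing. It is a generic illustration of how an MoE layer works, not Snowflake’s actual implementation, and the expert count, dimensions and top-k value are arbitrary toy numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MoE layer: each token is routed to its top-k experts only.
num_experts, d_model, top_k = 8, 16, 2

# Each "expert" is just a small feed-forward weight matrix here.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(num_experts)]
router = rng.normal(size=(d_model, num_experts))   # gating weights (learned in a real model)

def moe_layer(x):
    """x: (d_model,) vector for one token. Returns the MoE output for that token."""
    logits = x @ router                               # score every expert for this token
    chosen = np.argsort(logits)[-top_k:]              # indices of the top-k experts
    weights = np.exp(logits[chosen] - logits[chosen].max())
    weights /= weights.sum()                          # softmax over the selected experts only
    # Only the chosen experts run; the rest stay idle, which is where the
    # "480B parameters, 17B active" style of compute saving comes from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

token = rng.normal(size=d_model)
print(moe_layer(token).shape)   # (16,), the same shape a dense feed-forward layer would return
```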

Snowflake claims that this efficient design enabled it to train Arctic LLM on open public web data sets (including RefinedWeb, C4, RedPajama and StarCoder) at “roughly one-eighth the cost of similar models.”

Running everywhere

Snowflake is providing resources like coding templates and a list of training sources alongside Arctic LLM to guide users through the process of getting the model up and running and fine-tuning it for particular use cases. But, recognizing that those are likely to be costly and complex undertakings for most developers (fine-tuning or running Arctic LLM requires around eight GPUs), Snowflake’s also pledging to make Arctic LLM available across a range of hosts, including Hugging Face, Microsoft Azure, Together AI’s model-hosting service, and enterprise generative AI platform Lamini.
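
For a rough sense of where the eight-GPU figure comes from, here is some back-of-the-envelope sizing of my own (the 80 GB-per-accelerator figure and the precisions are my assumptions, not Snowflake’s numbers): even though only 17 billion parameters are active per token, all 480 billion still have to sit in accelerator memory to be routed to.

```python
# Rough memory sizing for a 480-billion-parameter MoE model (my own estimate).
PARAMS = 480e9
GPU_MEMORY_GB = 80   # assumes 80 GB accelerators, e.g. A100/H100-class cards

for precision, bytes_per_param in [("fp16/bf16", 2), ("int8", 1), ("4-bit", 0.5)]:
    weights_gb = PARAMS * bytes_per_param / 1e9
    gpus_needed = weights_gb / GPU_MEMORY_GB
    print(f"{precision:9s}: ~{weights_gb:,.0f} GB of weights, ~{gpus_needed:.0f} GPUs just to hold them")

# fp16/bf16: ~960 GB (~12 GPUs); int8: ~480 GB (~6 GPUs); 4-bit: ~240 GB (~3 GPUs).
# KV cache, activations and framework overhead push the real requirement higher,
# which is how a quantized deployment lands in the "around eight GPUs" range.
```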

Here’s the rub, though: Arctic LLM will be available first on Cortex, Snowflake’s platform for building AI- and machine learning-powered apps and services. The company’s unsurprisingly pitching it as the preferred way to run Arctic LLM with “security,” “governance” and scalability.

“Our dream here is, within a year, to have an API that our customers can use so that business users can directly talk to data,” Ramaswamy said. “It would’ve been easy for us to say, ‘Oh, we’ll just wait for some open source model and we’ll use it.’ Instead, we’re making a foundational investment because we think [it’s] going to unlock more value for our customers.”

So I’m left wondering: Who’s Arctic LLM really for besides Snowflake customers?

In a landscape full of “open” generative models that can be fine-tuned for practically any purpose, Arctic LLM doesn’t stand out in any obvious way. Its architecture might bring efficiency gains over some of the other options out there. But I’m not convinced that they’ll be dramatic enough to sway enterprises away from the countless other well-known and -supported, business-friendly generative models (e.g. GPT-4).

There’s also a point in Arctic LLM’s disfavor to consider: its relatively small context.

In generative AI, context window refers to input data (e.g. text) that a model considers before generating output (e.g. more text). Models with small context windows are prone to forgetting the content of even very recent conversations, while models with larger contexts typically avoid this pitfall.

Arctic LLM’s context is between ~8,000 and ~24,000 words, depending on the fine-tuning method — far below that of models like Anthropic’s Claude 3 Opus and Google’s Gemini 1.5 Pro.

Snowflake doesn’t mention it in the marketing, but Arctic LLM almost certainly suffers from the same limitations and shortcomings as other generative AI models — namely, hallucinations (i.e. confidently answering requests incorrectly). That’s because Arctic LLM, along with every other generative AI model in existence, is a statistical probability machine — one that, again, has a small context window. It guesses based on vast amounts of examples which data makes the most “sense” to place where (e.g. the word “go” before “to the market” in the sentence “I go to the market”). It’ll inevitably guess wrong — and that’s a “hallucination.”
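
To illustrate what “statistical probability machine” means in practice, here is a toy next-word predictor built from bigram counts. It has nothing to do with Arctic LLM’s actual internals, but the failure mode is the same in miniature: the model emits whatever continuation is statistically likely given its context, whether or not it happens to be true.

```python
from collections import Counter, defaultdict
import random

# A toy "language model": bigram counts over a tiny corpus (purely illustrative).
corpus = "i go to the market . i go to the gym . i go to work .".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_word(prev, seed=None):
    """Sample the next word in proportion to how often it followed `prev` in the corpus."""
    words, weights = zip(*bigrams[prev].items())
    return random.Random(seed).choices(words, weights=weights)[0]

print(next_word("go"))   # 'to' -- the only continuation ever seen after "go"
print(next_word("the"))  # 'market', 'gym' or 'work', weighted purely by frequency

# The model has no notion of truth, only of frequency, which is why confident
# but wrong continuations ("hallucinations") are baked into the approach.
```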

As Devin writes in his piece, until the next major technical breakthrough, incremental improvements are all we have to look forward to in the generative AI domain. That won’t stop vendors like Snowflake from championing them as great achievements, though, and marketing them for all they’re worth.



Tesla launches new Model 3 Performance variant to rev up demand | TechCrunch


Tesla has officially revealed a new Performance variant of the recently refreshed Model 3 sedan as the company looks to fight off receding demand.

The new version of the Model 3, which starts at $52,990, has a new active damping system and adaptive suspension for better handling and comfort, delivers 296 miles of battery range and can travel from 0 to 60 miles per hour in 2.9 seconds, with 510 horsepower on offer.

Compared to the previous Model 3 Performance, the new version has 32% more peak power, 16% more peak torque and 5% less drag. It does all this while consuming less energy than its predecessor, according to Tesla. That’s thanks in part to a new-generation drive unit, and also a rear diffuser and spoiler. The front and rear ends of the car have also benefited from a slight facelift, separating it from the other versions of the newly tweaked Model 3 revealed last year.

The Model 3 Performance still carries with it the wholesale changes made with that recent refresh. That means there’s an ambient light bar wrapping around the cabin interior, better sound dampening and upgraded materials throughout, a stalk-less steering wheel and a new touchscreen display.

Tesla is launching the new Model 3 Performance at a time when the company is coming off one of its worst quarters for deliveries in recent memory, having dropped 20% compared to the fourth quarter of 2023. The impact of that disappointing first quarter is set to be revealed Tuesday when the company publishes its financial results after the market closes.

Tesla is also just one week removed from announcing sweeping layoffs of more than 10% of its global workforce, with the cuts affecting seemingly all corners of the company.

Orders placed Tuesday, at least at the time of publication, show an estimated delivery window of May/June 2024 in North America.



Adobe claims its new image generation model is its best yet | TechCrunch


Firefly, Adobe’s family of generative AI models, doesn’t have the best reputation among creatives.

The Firefly image generation model in particular has been derided as underwhelming and flawed compared to Midjourney, OpenAI’s DALL-E 3, and other rivals, with a tendency to distort limbs and landscapes and miss the nuances in prompts. But Adobe is trying to right the ship with its third-generation model, Firefly Image 3, releasing this week during the company’s Max London conference.

The model, now available in Photoshop (beta) and Adobe’s Firefly web app, produces more “realistic” imagery than its predecessor (Image 2) and its predecessor’s predecessor (Image 1) thanks to an ability to understand longer, more complex prompts and scenes as well as improved lighting and text generation capabilities. It should more accurately render things like typography, iconography, raster images and line art, says Adobe, and is “significantly” more adept at depicting dense crowds and people with “detailed features” and “a variety of moods and expressions.”

For what it’s worth, from my brief, unscientific comparison, Image 3 does appear to be a step up from Image 2.

I wasn’t able to try Image 3 myself. But Adobe PR sent a few outputs and prompts from the model, and I managed to run those same prompts through Image 2 on the web to get samples to compare the Image 3 outputs with. (Keep in mind that the Image 3 outputs could’ve been cherry-picked.)

Notice the lighting in this headshot from Image 3 compared to the one below it, from Image 2:

From Image 3. Prompt: “Studio portrait of young woman.”

Same prompt as above, from Image 2.

The Image 3 output looks more detailed and lifelike to my eyes, with shadowing and contrast that’s largely absent from the Image 2 sample.

Here’s a set of images showing Image 3’s scene understanding at play:

From Image 3. Prompt: “An artist in her studio sitting at desk looking pensive with tons of paintings and ethereal.”

Same prompt as above. From Image 2.

Note the Image 2 sample is fairly basic compared to the output from Image 3 in terms of the level of detail — and overall expressiveness. There’s some wonkiness going on with the subject’s shirt in the Image 3 sample (around the waist area), but the pose is more complex than the subject’s pose from Image 2. (And Image 2’s clothes are also a bit off.)

Some of Image 3’s improvements can no doubt be traced to a larger and more diverse training data set.

Like Image 2 and Image 1, Image 3 is trained on uploads to Adobe Stock, Adobe’s royalty-free media library, along with licensed and public domain content for which the copyright has expired. Adobe Stock grows all the time, and consequently so too does the available training data set.

In an effort to ward off lawsuits and position itself as a more “ethical” alternative to generative AI vendors who train on images indiscriminately (e.g. OpenAI, Midjourney), Adobe has a program to pay Adobe Stock contributors to the training data set. (We’ll note that the terms of the program are rather opaque, though.) Controversially, Adobe also trains Firefly models on AI-generated images, which some consider a form of data laundering.

Recent Bloomberg reporting revealed AI-generated images in Adobe Stock aren’t excluded from Firefly image-generating models’ training data, a troubling prospect considering those images might contain regurgitated copyrighted material. Adobe has defended the practice, claiming that AI-generated images make up only a small portion of its training data and go through a moderation process to ensure they don’t depict trademarks or recognizable characters or reference artists’ names.

Of course, neither diverse, more “ethically” sourced training data nor content filters and other safeguards guarantee a perfectly flaw-free experience — see users generating people flipping the bird with Image 2. The real test of Image 3 will come once the community gets its hands on it.

New AI-powered features

Image 3 powers several new features in Photoshop beyond enhanced text-to-image.

A new “style engine” in Image 3, along with a new auto-stylization toggle, allows the model to generate a wider array of colors, backgrounds and subject poses. They feed into Reference Image, an option that lets users condition the model on an image whose colors or tone they want their future generated content to align with.

Three new generative tools — Generate Background, Generate Similar and Enhance Detail — leverage Image 3 to perform precision edits on images. The (self-descriptive) Generate Background replaces a background with a generated one that blends into the existing image, while Generate Similar offers variations on a selected portion of a photo (a person or an object, for example). As for Enhance Detail, it “fine-tunes” images to improve sharpness and clarity.

If these features sound familiar, that’s because they’ve been in beta in the Firefly web app for at least a month (and Midjourney for much longer than that). This marks their Photoshop debut — in beta.

Speaking of the web app, Adobe isn’t neglecting this alternate route to its AI tools.

To coincide with the release of Image 3, the Firefly web app is getting Structure Reference and Style Reference, which Adobe’s pitching as new ways to “advance creative control.” (Both were announced in March, but they’re now becoming widely available.) With Structure Reference, users can generate new images that match the “structure” of a reference image — say, a head-on view of a race car. Style Reference is essentially style transfer by another name, preserving the content of an image (e.g. elephants in the African Safari) while mimicking the style (e.g. pencil sketch) of a target image.

Here’s Structure Reference in action:

Original image.

Transformed with Structure Reference.

And Style Reference:

Original image.

Transformed with Style Reference.

I asked Adobe if, with all the upgrades, Firefly image generation pricing would change. Currently, the cheapest Firefly premium plan is $4.99 per month — undercutting competition like Midjourney ($10 per month) and OpenAI (which gates DALL-E 3 behind a $20-per-month ChatGPT Plus subscription).

Adobe said that its current tiers will remain in place for now, along with its generative credit system. It also said that its indemnity policy, which states Adobe will pay copyright claims related to works generated in Firefly, won’t be changing either, nor will its approach to watermarking AI-generated content. Content Credentials — metadata to identify AI-generated media — will continue to be automatically attached to all Firefly image generations on the web and in Photoshop, whether generated from scratch or partially edited using generative features.





Poe introduces a price-per-message revenue model for AI bot creators | TechCrunch


Bot creators now have a new way to make money with Poe, the Quora-owned AI chatbot platform. On Monday, the company introduced a revenue model that allows creators to set a per-message price for their bots so they can make money whenever a user messages them. The addition follows an October 2023 release of a revenue-sharing program that would give bot creators a cut of the earnings when their users subscribed to Poe’s premium product.

First launched by Quora in February 2023, Poe offers users the ability to sample a variety of AI chatbots, including those from ChatGPT maker OpenAI, Anthropic, Google, and others. The idea is to give consumers an easy way to toy with new AI technologies all in one place while also giving Quora a potential source of new content.

The company’s revenue models offer a new twist on the creator economy by rewarding AI enthusiasts who generate “prompt bots,” as well as developer-built server bots that integrate with Poe’s AI.

Last fall, Quora announced it would begin a revenue-sharing program with bot creators and said it would “soon” open up the option for creators to set a per-message fee on their bots. Although it’s been nearly 5 months since that announcement — hardly “soon” — the latter is now going live.

Quora CEO Adam D’Angelo explained on Monday that Poe users will only see message points for each bot, which come out of the same pool of points they have as either a free user or a Poe subscriber. However, creators will be paid in dollars, he said.

“This pricing mechanism is important for developers with substantial model inference or API costs,” D’Angelo noted in a post on X. “Our goal is to enable a thriving ecosystem of model developers and bot creators who build on top of models, and covering these operational costs is a key part of that,” he added.

The new revenue model could spur the development of new kinds of bots, including in areas like tutoring, knowledge, assistants, analysis, storytelling, and image generation, D’Angelo believes.

The offering is currently available to U.S. bot creators only but will expand globally in the future. It joins the creator monetization program that pays up to $20 per user who subscribes to Poe thanks to a creator’s bots.

Alongside the per-message revenue model, Poe also launched an enhanced analytics dashboard that displays average earnings for creators’ bots across paywalls, subscriptions, and messages. Its insights are updated daily and will allow creators to get a better handle on how their pricing drives bot usage and revenue.





Google open sources tools to support AI model development | TechCrunch


In a typical year, Cloud Next — one of Google’s two major annual developer conferences, the other being I/O — almost exclusively features managed and otherwise closed source, gated-behind-locked-down-APIs products and services. But this year, whether to foster developer goodwill or advance its ecosystem ambitions (or both), Google debuted a number of open source tools primarily aimed at supporting generative AI projects and infrastructure.

The first, MaxDiffusion, which Google actually quietly released in February, is a collection of reference implementations of various diffusion models — models like the image generator Stable Diffusion — that run on XLA devices. “XLA” stands for Accelerated Linear Algebra, an admittedly awkward acronym referring to a technique that optimizes and speeds up specific types of AI workloads, including fine-tuning and serving.

Google’s own tensor processing units (TPUs) are XLA devices, as are recent Nvidia GPUs.
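
For readers who haven’t touched XLA directly: in practice it usually shows up through a compiler front end such as JAX, where a single decorator hands a function to XLA to be fused and compiled for whatever backend is available. A minimal sketch, assuming jax is installed (this is a generic illustration, not code from MaxDiffusion):

```python
import jax
import jax.numpy as jnp

# jax.jit traces this function once and hands the trace to XLA, which fuses the
# matmul, bias add and GELU into optimized kernels for the available backend
# (CPU, GPU or TPU).
@jax.jit
def ffn(x, w, b):
    return jax.nn.gelu(x @ w + b)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 512))
w = jax.random.normal(key, (512, 512))
b = jnp.zeros(512)

out = ffn(x, w, b)   # first call compiles via XLA; subsequent calls reuse the compiled kernel
print(out.shape)     # (32, 512)
```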

Beyond MaxDiffusion, Google’s launching JetStream, a new engine to run generative AI models — specifically text-generating models (so not Stable Diffusion). Currently limited to supporting TPUs, with GPU compatibility supposedly coming in the future, JetStream offers up to 3x higher “performance per dollar” for models like Google’s own Gemma 7B and Meta’s Llama 2, Google claims.

“As customers bring their AI workloads to production, there’s an increasing demand for a cost-efficient inference stack that delivers high performance,” Mark Lohmeyer, Google Cloud’s GM of compute and machine learning infrastructure, wrote in a blog post shared with TechCrunch. “JetStream helps with this need … and includes optimizations for popular open models such as Llama 2 and Gemma.”

Now, “3x” improvement is quite a claim to make, and it’s not exactly clear how Google arrived at that figure. Using which generation of TPU? Compared to which baseline engine? And how’s “performance” being defined here, anyway?

I’ve asked Google all these questions and will update this post if I hear back.

Second-to-last on the list of Google’s open source contributions are new additions to MaxText, Google’s collection of text-generating AI models targeting TPUs and Nvidia GPUs in the cloud. MaxText now includes Gemma 7B, OpenAI’s GPT-3 (the predecessor to GPT-4), Llama 2 and models from AI startup Mistral — all of which Google says can be customized and fine-tuned to developers’ needs.

“We’ve heavily optimized [the models’] performance on TPUs and also partnered closely with Nvidia to optimize performance on large GPU clusters,” Lohmeyer said. “These improvements maximize GPU and TPU utilization, leading to higher energy efficiency and cost optimization.”

Finally, Google’s collaborated with Hugging Face, the AI startup, to create Optimum TPU, which provides tooling to bring certain AI workloads to TPUs. The goal is to reduce the barrier to entry for getting generative AI models onto TPU hardware, according to Google — in particular text-generating models.

But at present, Optimum TPU is a bit bare-bones. The only model it works with is Gemma 7B. And Optimum TPU doesn’t yet support training generative models on TPUs — only running them.
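
The inference-only workflow Optimum TPU targets looks roughly like the sketch below, written here with the generic Hugging Face transformers API for illustration rather than Optimum TPU’s own entry points, which may differ. It assumes you have access to the gated google/gemma-7b weights and enough accelerator memory to load them:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Generic Hugging Face inference sketch for Gemma 7B. Optimum TPU wraps a workflow
# like this with TPU-specific loading and serving; its exact API may differ.
model_id = "google/gemma-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Explain what a TPU is in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```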

Google’s promising improvements down the line.

