From Digital Age to Nano Age. WorldWide.

Tag: generative

Robotic Automations

Hugging Face releases a benchmark for testing generative AI on health tasks | TechCrunch


Generative AI models are increasingly being brought to healthcare settings — in some cases prematurely, perhaps. Early adopters believe that they’ll unlock increased efficiency while revealing insights that’d otherwise be missed. Critics, meanwhile, point out that these models have flaws and biases that could contribute to worse health outcomes.

But is there a quantitative way to know how helpful, or harmful, a model might be when tasked with things like summarizing patient records or answering health-related questions?

Hugging Face, the AI startup, proposes a solution in a newly released benchmark test called Open Medical-LLM. Created in partnership with researchers at the nonprofit Open Life Science AI and the University of Edinburgh’s Natural Language Processing Group, Open Medical-LLM aims to standardize evaluating the performance of generative AI models on a range of medical-related tasks.

Open Medical-LLM isn’t a from-scratch benchmark, per se, but rather a stitching-together of existing test sets — MedQA, PubMedQA, MedMCQA and so on — designed to probe models for general medical knowledge and related fields, such as anatomy, pharmacology, genetics and clinical practice. The benchmark contains multiple choice and open-ended questions that require medical reasoning and understanding, drawing from material including U.S. and Indian medical licensing exams and college biology test question banks.
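To make the multiple-choice portion concrete, here is a minimal sketch of how accuracy on such a test set could be scored. It is not Hugging Face's evaluation code: the two questions, the `always_a` stand-in "model" and the `accuracy` helper are hypothetical and exist only for illustration.

```python
# Hypothetical two-question test set; real benchmarks hold thousands of items.
QUESTIONS = [
    {"question": "Which organ pumps blood through the body?",
     "options": {"A": "Heart", "B": "Liver"},
     "answer": "A"},
    {"question": "Which organ produces bile?",
     "options": {"A": "Heart", "B": "Liver"},
     "answer": "B"},
]


def always_a(question):
    """Stand-in for a model: ignores the question and always picks option A."""
    return "A"


def accuracy(questions, predict):
    """Fraction of questions where the predicted letter matches the answer key."""
    correct = sum(1 for q in questions if predict(q) == q["answer"])
    return correct / len(questions)


if __name__ == "__main__":
    print(f"accuracy: {accuracy(QUESTIONS, always_a):.2f}")
```

A score like this only measures agreement with an answer key, which is the gap between question-answering and clinical practice that the critics quoted in the article point to.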

“[Open Medical-LLM] enables researchers and practitioners to identify the strengths and weaknesses of different approaches, drive further advancements in the field and ultimately contribute to better patient care and outcome,” Hugging Face wrote in a blog post.


Hugging Face is positioning the benchmark as a “robust assessment” of healthcare-bound generative AI models. But some medical experts on social media cautioned against putting too much stock into Open Medical-LLM, lest it lead to ill-informed deployments.

On X, Liam McCoy, a resident physician in neurology at the University of Alberta, pointed out that the gap between the “contrived environment” of medical question-answering and actual clinical practice can be quite large.

Hugging Face research scientist Clémentine Fourrier, who co-authored the blog post, agreed.

“These leaderboards should only be used as a first approximation of which [generative AI model] to explore for a given use case, but then a deeper phase of testing is always needed to examine the model’s limits and relevance in real conditions,” Fourrier replied on X. “Medical [models] should absolutely not be used on their own by patients, but instead should be trained to become support tools for MDs.”

It brings to mind Google’s experience when it tried to bring an AI screening tool for diabetic retinopathy to healthcare systems in Thailand.

Google created a deep learning system that scanned images of the eye, looking for evidence of retinopathy, a leading cause of vision loss. But despite high theoretical accuracy, the tool proved impractical in real-world testing, frustrating both patients and nurses with inconsistent results and a general lack of harmony with on-the-ground practices.

It’s telling that of the 139 AI-related medical devices the U.S. Food and Drug Administration has approved to date, none use generative AI. It’s exceptionally difficult to test how a generative AI tool’s performance in the lab will translate to hospitals and outpatient clinics, and, perhaps more importantly, how the outcomes might trend over time.

That’s not to suggest Open Medical-LLM isn’t useful or informative. The results leaderboard, if nothing else, serves as a reminder of just how poorly models answer basic health questions. But neither Open Medical-LLM nor any other benchmark, for that matter, is a substitute for carefully thought-out real-world testing.




Software Development in Sri Lanka


NeuBird is building a generative AI solution for complex cloud-native environments | TechCrunch


NeuBird founders Goutham Rao and Vinod Jayaraman came from Portworx, a cloud-native storage solution they eventually sold to Pure Storage in 2020 for $370 million. It was their third successful exit.

When they went looking for their next startup challenge last year, they saw an opportunity to combine their cloud-native knowledge, especially around IT operations, with the burgeoning area of generative AI. 

Today NeuBird announced a $22 million investment from Mayfield to get the idea to market. It’s a hefty amount for an early-stage startup, but the firm is likely banking on the founders’ experience to build another successful company.

Rao, the CEO, says that while the cloud-native community has done a good job at solving a lot of difficult problems, it has created increasing levels of complexity along the way. 

“We’ve done an incredible job as a community over the past 10-plus years building cloud-native architectures with service-oriented designs. This added a lot of layers, which is good. That’s a proper way to design software, but this also came at a cost of increased telemetry. There’s just too many layers in the stack,” Rao told TechCrunch.

They concluded that this level of data was making it impossible for human engineers to find, diagnose and solve problems at scale inside large organizations. At the same time, large language models were beginning to mature, so the founders decided to put them to work on the problem.

“We’re leveraging large language models in a very unique way to be able to analyze thousands and thousands of metrics, alerts, logs, traces and application configuration information in a matter of seconds and be able to diagnose what the health of the environment is, detect if there’s a problem and come up with a solution,” he said.

The company is essentially building a trusted digital assistant to the engineering team. “So it’s a digital co-worker that works alongside SREs and ITOps engineers, and monitors all of the alerts and logs looking for issues,” he said. The goal is to reduce the amount of time it takes to respond to and solve an incident from hours to minutes, and they believe that by putting generative AI to work on the problem, they can help companies achieve that goal. 

The founders understand the limitations of large language models, and are looking to reduce hallucinated or incorrect responses by using a limited set of data to train the models, and by setting up other systems that help deliver more accurate responses.

“Because we’re using this in a very controlled manner for a very specific use case for environments we know, we can cross check the results that are coming out of the AI, again through a vector database and see if it’s even making sense and if we’re not comfortable with it, we won’t recommend it to the user.”

Customers can connect directly to their various cloud systems by entering their credentials, and without moving data, NeuBird can use the access to cross-check against other available information to come up with a solution, reducing the overall difficulty associated with getting the company-specific data for the model to work with. 

NeuBird uses various models, including Llama 2 for analyzing logs and metrics. They are using Mistral for other types of analysis. The company actually turns every natural language interaction into a SQL query, essentially turning unstructured data into structured data. They believe this will result in greater accuracy. 

The early-stage startup is working with design and alpha partners right now refining the idea as they work to bring the product to market later this year. Rao says they took a big chunk of money out of the gate because they wanted the room to work on the problem without having to worry about looking for more money too soon.



Google injects generative AI into its cloud security tools | TechCrunch


At its annual Cloud Next conference in Las Vegas, Google on Tuesday introduced new cloud-based security products and services — in addition to updates to existing products and services — aimed at customers managing large, multi-tenant corporate networks.

Many of the announcements had to do with Gemini, Google’s flagship family of generative AI models.

For example, Google unveiled Gemini in Threat Intelligence, a new Gemini-powered component of the company’s Mandiant cybersecurity platform. Now in public preview, Gemini in Threat Intelligence can analyze large portions of potentially malicious code and let users perform natural language searches for ongoing threats or indicators of compromise, as well as summarize open source intelligence reports from around the web.

“Gemini in Threat Intelligence now offers conversational search across Mandiant’s vast and growing repository of threat intelligence directly from frontline investigations,” Sunil Potti, GM of cloud security at Google, wrote in a blog post shared with TechCrunch. “Gemini will navigate users to the most relevant pages in the integrated platform for deeper investigation … Plus, [Google’s malware detection service] VirusTotal now automatically ingests OSINT reports, which Gemini summarizes directly in the platform.”

Elsewhere, Gemini can now assist with cybersecurity investigations in Chronicle, Google’s cybersecurity telemetry offering for cloud customers. Set to roll out by the end of the month, the new capability guides security analysts through their typical workflows, recommending actions based on the context of a security investigation, summarizing security event data and creating breach and exploit detection rules from a chatbot-like interface.

And in Security Command Center, Google’s enterprise cybersecurity and risk management suite, a new Gemini-driven feature lets security teams search for threats using natural language while providing summaries of misconfigurations, vulnerabilities and possible attack paths.

Rounding out the security updates were privileged access manager (in preview), a service that offers just-in-time, time-bound and approval-based access options designed to help mitigate risks tied to privileged access misuse. Google’s also rolling out principal access boundary (in preview, as well), which lets admins implement restrictions on network root-level users so that those users can only access authorized resources within a specifically defined boundary.
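As a rough illustration of what "just-in-time, time-bound and approval-based" access means in practice, the sketch below models a grant that is valid only if it was approved and has not yet expired. This is not Google's API: the `Grant` class, `request_access` and `is_active` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Grant:
    """A hypothetical elevated-access grant with an expiry time."""
    user: str
    role: str
    approved: bool
    expires_at: datetime


def request_access(user, role, minutes, approver_ok, now):
    """Create a grant on request (just-in-time) that lasts `minutes` from `now`."""
    return Grant(user, role, approver_ok, now + timedelta(minutes=minutes))


def is_active(grant, now):
    """A grant counts only if it was approved and has not yet expired."""
    return grant.approved and now < grant.expires_at


if __name__ == "__main__":
    start = datetime(2024, 4, 9, 12, 0, tzinfo=timezone.utc)
    grant = request_access("alice", "admin", 30, approver_ok=True, now=start)
    print(is_active(grant, start + timedelta(minutes=10)))  # within the window
    print(is_active(grant, start + timedelta(minutes=45)))  # after expiry
```

The design point is that elevated access is never standing: it has to be requested, approved and then lapses on its own.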

Lastly, Autokey (in preview) aims to simplify creating and managing customer encryption keys for high-security use cases, while Audit Manager (also in preview) provides tools for Google Cloud customers in regulated industries to generate proof of compliance for their workloads and cloud-hosted data.

“Generative AI offers tremendous potential to tip the balance in favor of defenders,” Potti wrote in the blog post. “And we continue to infuse AI-driven capabilities into our products.”

Google isn’t the only company attempting to productize generative AI–powered security tooling. Microsoft last year launched a set of services that leverage generative AI to correlate data on attacks while prioritizing cybersecurity incidents. Startups, including Aim Security, are also jumping into the fray, aiming to corner the nascent space.

But with generative AI’s tendency to make mistakes, it remains to be seen whether these tools have staying power.



Intel and others commit to building open generative AI tools for the enterprise | TechCrunch


Can generative AI designed for the enterprise (e.g. AI that autocompletes reports, spreadsheet formulas and so on) ever be interoperable? Along with a coterie of organizations including Cloudera and Intel, the Linux Foundation, the nonprofit organization that supports and maintains a growing number of open source efforts, aims to find out.

The Linux Foundation today announced the launch of the Open Platform for Enterprise AI (OPEA), a project to foster the development of open, multi-provider and composable (i.e. modular) generative AI systems. Under the purview of the Linux Foundation’s LFAI and Data org, which focuses on AI- and data-related platform initiatives, OPEA’s goal will be to pave the way for the release of “hardened,” “scalable” generative AI systems that “harness the best open source innovation from across the ecosystem,” LFAI and Data executive director Ibrahim Haddad said in a press release.

“OPEA will unlock new possibilities in AI by creating a detailed, composable framework that stands at the forefront of technology stacks,” Haddad said. “This initiative is a testament to our mission to drive open source innovation and collaboration within the AI and data communities under a neutral and open governance model.”

In addition to Cloudera and Intel, OPEA (one of the Linux Foundation’s Sandbox Projects, an incubator program of sorts) counts among its members enterprise heavyweights like IBM-owned Red Hat, Hugging Face, Domino Data Lab, MariaDB and VMware.

So what might they build together exactly? Haddad hints at a few possibilities, such as “optimized” support for AI toolchains and compilers, which enable AI workloads to run across different hardware components, as well as “heterogeneous” pipelines for retrieval-augmented generation (RAG).

RAG is becoming increasingly popular in enterprise applications of generative AI, and it’s not difficult to see why. Most generative AI models’ answers and actions are limited to the data on which they’re trained. But with RAG, a model’s knowledge base can be extended to info outside the original training data. RAG models reference this outside info — which can take the form of proprietary company data, a public database or some combination of the two — before generating a response or performing a task.
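The retrieve-then-generate flow can be sketched in a few lines. This is a toy under stated assumptions, not OPEA's or any vendor's implementation: retrieval here is simple word overlap rather than a vector search, the two documents are made up, and no model is actually called; the code stops at building the prompt a model would receive.

```python
import re
from collections import Counter

# Made-up "proprietary company data" for illustration only.
DOCS = [
    "The expense policy allows up to 50 dollars per day for meals.",
    "Vacation requests must be filed two weeks in advance.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, doc):
    """Count the tokens the query and the document share (with multiplicity)."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    return sum(min(q[t], d[t]) for t in q)


def retrieve(query, docs, k=1):
    """Return the k documents that overlap most with the query."""
    return sorted(docs, key=lambda doc: score(query, doc), reverse=True)[:k]


def build_prompt(query, docs, k=1):
    """Place the retrieved text ahead of the question, as a RAG system would."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("How many dollars per day for meals?", DOCS))
```

In a production system the overlap score would typically be replaced by embedding similarity, and the prompt would be passed to a generative model.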

A diagram explaining RAG models.

Intel offered a few more details in its own press release:

Enterprises are challenged with a do-it-yourself approach [to RAG] because there are no de facto standards across components that allow enterprises to choose and deploy RAG solutions that are open and interoperable and that help them quickly get to market. OPEA intends to address these issues by collaborating with the industry to standardize components, including frameworks, architecture blueprints and reference solutions.

Evaluation will also be a key part of what OPEA tackles.

In its GitHub repository, OPEA proposes a rubric for grading generative AI systems along four axes: performance, features, trustworthiness and “enterprise-grade” readiness. Performance as OPEA defines it pertains to “black-box” benchmarks from real-world use cases. Features is an appraisal of a system’s interoperability, deployment choices and ease of use. Trustworthiness looks at an AI model’s ability to guarantee “robustness” and quality. And enterprise readiness focuses on the requirements to get a system up and running sans major issues.

Rachel Roumeliotis, director of open source strategy at Intel, says that OPEA will work with the open source community to offer tests based on the rubric — and provide assessments and grading of generative AI deployments on request.

OPEA’s other endeavors are a bit up in the air at the moment. But Haddad floated the potential of open model development along the lines of Meta’s expanding Llama family and Databricks’ DBRX. Toward that end, in the OPEA repo, Intel has already contributed reference implementations for a generative-AI-powered chatbot, document summarizer and code generator optimized for its Xeon 6 and Gaudi 2 hardware.

Now, OPEA’s members are very clearly invested (and self-interested, for that matter) in building tooling for enterprise generative AI. Cloudera recently launched partnerships to create what it’s pitching as an “AI ecosystem” in the cloud. Domino offers a suite of apps for building and auditing business-forward generative AI. And VMware, oriented toward the infrastructure side of enterprise AI, last August rolled out new “private AI” compute products.

The question is — under OPEA — will these vendors actually work together to build cross-compatible AI tools?

There’s an obvious benefit to doing so. Customers will happily draw on multiple vendors depending on their needs, resources and budgets. But history has shown that it’s all too easy to become inclined toward vendor lock-in. Let’s hope that’s not the ultimate outcome here.



Adobe's working on generative video, too | TechCrunch


Adobe says it’s building an AI model to generate video. But it’s not revealing when this model will launch, exactly — or much about it besides the fact that it exists.

Offered as an answer of sorts to OpenAI’s Sora, Google’s Imagen 2 and models from the growing number of startups in the nascent generative AI video space, Adobe’s model — a part of the company’s expanding Firefly family of generative AI products — will make its way into Premiere Pro, Adobe’s flagship video editing suite, sometime later this year, Adobe says.

Like many generative AI video tools today, Adobe’s model creates footage from scratch (from either a prompt or reference images), and it powers three new features in Premiere Pro: object addition, object removal and generative extend.

They’re pretty self-explanatory.

Object addition lets users select a segment of a video clip — the upper third, say, or lower left corner — and enter a prompt to insert objects within that segment. In a briefing with TechCrunch, an Adobe spokesperson showed a still of a real-world briefcase filled with diamonds generated by Adobe’s model.

AI-generated diamonds, courtesy of Adobe.

Object removal removes objects from clips, like boom mics or coffee cups in the background of a shot.

Removing objects with AI. Notice the results aren’t quite perfect.

As for generative extend, it adds a few frames to the beginning or end of a clip (unfortunately, Adobe wouldn’t say how many frames). Generative extend isn’t meant to create whole scenes, but rather add buffer frames to sync up with a soundtrack or hold on to a shot for an extra beat — for instance to add emotional heft.


To address the fears of deepfakes that inevitably crop up around generative AI tools such as these, Adobe says it’s bringing Content Credentials (metadata to identify AI-generated media) to Premiere. Content Credentials, a media provenance standard that Adobe backs through its Content Authenticity Initiative, were already in Photoshop and a component of Adobe’s image-generating Firefly models. In Premiere, they’ll indicate not only which content was AI-generated but which AI model was used to generate it.

I asked Adobe what data — images, videos and so on — were used to train the model. The company wouldn’t say, nor would it say how (or whether) it’s compensating contributors to the data set.

Last week, Bloomberg, citing sources familiar with the matter, reported that Adobe’s paying photographers and artists on its stock media platform, Adobe Stock, up to $120 for submitting short video clips to train its video generation model. The pay’s said to range from around $2.62 per minute of video to around $7.25 per minute depending on the submission, with higher-quality footage commanding correspondingly higher rates.

That’d be a departure from Adobe’s current arrangement with Adobe Stock artists and photographers whose work it’s using to train its image generation models. The company pays those contributors an annual bonus, not a one-time fee, depending on the volume of content they have in Stock and how it’s being used — albeit a bonus that’s subject to an opaque formula and not guaranteed from year to year.

Bloomberg’s reporting, if accurate, depicts an approach in stark contrast to that of generative AI video rivals like OpenAI, which is said to have scraped publicly available web data — including videos from YouTube — to train its models. YouTube’s CEO, Neal Mohan, recently said that use of YouTube videos to train OpenAI’s text-to-video generator would be an infraction of the platform’s terms of service, highlighting the legal tenuousness of OpenAI’s and others’ fair use argument.

Companies including OpenAI are being sued over allegations that they’re violating IP law by training their AI on copyrighted content without providing the owners credit or pay. Adobe seems to be intent on avoiding this end, like its sometime generative AI competition Shutterstock and Getty Images (which also have arrangements to license model training data), and — with its IP indemnity policy — positioning itself as a verifiably “safe” option for enterprise customers.

On the subject of payment, Adobe isn’t saying how much it’ll cost customers to use the upcoming video generation features in Premiere; presumably, pricing’s still being hashed out. But the company did reveal that the payment scheme will follow the generative credits system established with its early Firefly models.

For customers with a paid subscription to Adobe Creative Cloud, generative credits renew at the beginning of each month, with allotments ranging from 25 to 1,000 per month depending on the plan. More complex workloads (e.g. higher-resolution generated images or multiple-image generations) require more credits, as a general rule.

The big question in my mind is, will Adobe’s AI-powered video features be worth whatever they end up costing?

The Firefly image generation models so far have been widely derided as underwhelming and flawed compared to Midjourney, OpenAI’s DALL-E 3 and other competing tools. The lack of a release time frame on the video model doesn’t instill a lot of confidence that it’ll avoid the same fate. Neither does the fact that Adobe declined to show me live demos of object addition, object removal and generative extend, insisting instead on a prerecorded sizzle reel.

Perhaps to hedge its bets, Adobe says that it’s in talks with third-party vendors about integrating their video generation models into Premiere, as well, to power tools like generative extend and more.

One of those vendors is OpenAI.

Adobe says it’s collaborating with OpenAI on ways to bring Sora into the Premiere workflow. (An OpenAI tie-up makes sense given the AI startup’s overtures to Hollywood recently; tellingly, OpenAI CTO Mira Murati will be attending the Cannes Film Festival this year.) Other early partners include Pika, a startup building AI tools to generate and edit videos, and Runway, which was one of the first vendors to market with a generative video model.

An Adobe spokesperson said the company would be open to working with others in the future.

Now, to be crystal clear, these integrations are more of a thought experiment than a working product at present. Adobe stressed to me repeatedly that they’re in “early preview” and “research” rather than a thing customers can expect to play with anytime soon.

And that, I’d say, captures the overall tone of Adobe’s generative video presser.

Adobe’s clearly trying to signal with these announcements that it’s thinking about generative video, if only in the preliminary sense. It’d be foolish not to: to be caught flat-footed in the generative AI race is to risk losing out on a valuable potential new revenue stream, assuming the economics eventually work out in Adobe’s favor. (AI models are costly to train, run and serve, after all.)

But what it’s showing — concepts — isn’t super compelling frankly. With Sora in the wild and surely more innovations coming down the pipeline, the company has much to prove.



Generative AI is coming for healthcare, and not everyone's thrilled | TechCrunch


Generative AI, which can create and analyze images, text, audio, videos and more, is increasingly making its way into healthcare, pushed by both Big Tech firms and startups alike.

Google Cloud, Google’s cloud services and products division, is collaborating with Highmark Health, a Pittsburgh-based nonprofit healthcare company, on generative AI tools designed to personalize the patient intake experience. Amazon’s AWS division says it’s working with unnamed customers on a way to use generative AI to analyze medical databases for “social determinants of health.” And Microsoft Azure is helping to build a generative AI system for Providence, the not-for-profit healthcare network, to automatically triage messages to care providers sent from patients.  

Prominent generative AI startups in healthcare include Ambience Healthcare, which is developing a generative AI app for clinicians; Nabla, an ambient AI assistant for practitioners; and Abridge, which creates analytics tools for medical documentation.

The broad enthusiasm for generative AI is reflected in the investments in generative AI efforts targeting healthcare. Collectively, generative AI in healthcare startups have raised tens of millions of dollars in venture capital to date, and the vast majority of health investors say that generative AI has significantly influenced their investment strategies.

But both professionals and patients are mixed as to whether healthcare-focused generative AI is ready for prime time.

Generative AI might not be what people want

In a recent Deloitte survey, only about half (53%) of U.S. consumers said that they thought generative AI could improve healthcare — for example, by making it more accessible or shortening appointment wait times. Fewer than half said they expected generative AI to make medical care more affordable.

Andrew Borkowski, chief AI officer at the VA Sunshine Healthcare Network, the U.S. Department of Veterans Affairs’ largest health system, doesn’t think that the cynicism is unwarranted. Borkowski warned that generative AI’s deployment could be premature due to its “significant” limitations — and the concerns around its efficacy.

“One of the key issues with generative AI is its inability to handle complex medical queries or emergencies,” he told TechCrunch. “Its finite knowledge base — that is, the absence of up-to-date clinical information — and lack of human expertise make it unsuitable for providing comprehensive medical advice or treatment recommendations.”

Several studies suggest there’s credence to those points.

In a paper in the journal JAMA Pediatrics, OpenAI’s generative AI chatbot, ChatGPT, which some healthcare organizations have piloted for limited use cases, was found to make errors diagnosing pediatric diseases 83% of the time. And in testing OpenAI’s GPT-4 as a diagnostic assistant, physicians at Beth Israel Deaconess Medical Center in Boston observed that the model ranked the wrong diagnosis as its top answer nearly two times out of three.

Today’s generative AI also struggles with medical administrative tasks that are part and parcel of clinicians’ daily workflows. On the MedAlign benchmark to evaluate how well generative AI can perform things like summarizing patient health records and searching across notes, GPT-4 failed in 35% of cases.

OpenAI and many other generative AI vendors warn against relying on their models for medical advice. But Borkowski and others say they could do more. “Relying solely on generative AI for healthcare could lead to misdiagnoses, inappropriate treatments or even life-threatening situations,” Borkowski said.

Jan Egger, who leads AI-guided therapies at the University of Duisburg-Essen’s Institute for AI in Medicine, which studies the applications of emerging technology for patient care, shares Borkowski’s concerns. He believes that the only safe way to use generative AI in healthcare currently is under the close, watchful eye of a physician.

“The results can be completely wrong, and it’s getting harder and harder to maintain awareness of this,” Egger said. “Sure, generative AI can be used, for example, for pre-writing discharge letters. But physicians have a responsibility to check it and make the final call.”

Generative AI can perpetuate stereotypes

One particularly harmful way generative AI in healthcare can get things wrong is by perpetuating stereotypes.

In a 2023 study out of Stanford Medicine, a team of researchers tested ChatGPT and other generative AI–powered chatbots on questions about kidney function, lung capacity and skin thickness. Not only were ChatGPT’s answers frequently wrong, the co-authors found, but several answers also reinforced long-held untrue beliefs that there are biological differences between Black and white people, untruths that are known to have led medical providers to misdiagnose health problems.

The irony is, the patients most likely to be discriminated against by generative AI for healthcare are also those most likely to use it.

People who lack healthcare coverage — people of color, by and large, according to a KFF study — are more willing to try generative AI for things like finding a doctor or mental health support, the Deloitte survey showed. If the AI’s recommendations are marred by bias, it could exacerbate inequalities in treatment.

However, some experts argue that generative AI is improving in this regard.

In a Microsoft study published in late 2023, researchers said they achieved 90.2% accuracy on four challenging medical benchmarks using GPT-4. Vanilla GPT-4 couldn’t reach this score. But, the researchers say, through prompt engineering — designing prompts for GPT-4 to produce certain outputs — they were able to boost the model’s score by up to 16.2 percentage points. (Microsoft, it’s worth noting, is a major investor in OpenAI.)

Beyond chatbots

But asking a chatbot a question isn’t the only thing generative AI is good for. Some researchers say that medical imaging could benefit greatly from the power of generative AI.

In July, a group of scientists unveiled a system called complementarity-driven deferral to clinical workflow (CoDoC), in a study published in Nature. The system is designed to figure out when medical imaging specialists should rely on AI for diagnoses versus traditional techniques. CoDoC did better than specialists while reducing clinical workflows by 66%, according to the co-authors. 

In November, a Chinese research team demoed Panda, an AI model used to detect potential pancreatic lesions in X-rays. A study showed Panda to be highly accurate in classifying these lesions, which are often detected too late for surgical intervention. 

Indeed, Arun Thirunavukarasu, a clinical research fellow at the University of Oxford, said there’s “nothing unique” about generative AI precluding its deployment in healthcare settings.

“More mundane applications of generative AI technology are feasible in the short- and mid-term, and include text correction, automatic documentation of notes and letters and improved search features to optimize electronic patient records,” he said. “There’s no reason why generative AI technology — if effective — couldn’t be deployed in these sorts of roles immediately.”

“Rigorous science”

But while generative AI shows promise in specific, narrow areas of medicine, experts like Borkowski point to the technical and compliance roadblocks that must be overcome before generative AI can be useful — and trusted — as an all-around assistive healthcare tool.

“Significant privacy and security concerns surround using generative AI in healthcare,” Borkowski said. “The sensitive nature of medical data and the potential for misuse or unauthorized access pose severe risks to patient confidentiality and trust in the healthcare system. Furthermore, the regulatory and legal landscape surrounding the use of generative AI in healthcare is still evolving, with questions regarding liability, data protection and the practice of medicine by non-human entities still needing to be solved.”

Even Thirunavukarasu, bullish as he is about generative AI in healthcare, says that there needs to be “rigorous science” behind tools that are patient-facing.

“Particularly without direct clinician oversight, there should be pragmatic randomized control trials demonstrating clinical benefit to justify deployment of patient-facing generative AI,” he said. “Proper governance going forward is essential to capture any unanticipated harms following deployment at scale.”

Recently, the World Health Organization released guidelines that advocate for this type of science and human oversight of generative AI in healthcare as well as the introduction of auditing, transparency and impact assessments on this AI by independent third parties. The goal, the WHO spells out in its guidelines, would be to encourage participation from a diverse cohort of people in the development of generative AI for healthcare and an opportunity to voice concerns and provide input throughout the process.

“Until the concerns are adequately addressed and appropriate safeguards are put in place,” Borkowski said, “the widespread implementation of medical generative AI may be … potentially harmful to patients and the healthcare industry as a whole.”


Google goes all in on generative AI at Google Cloud Next | TechCrunch


This week in Las Vegas, 30,000 folks came together to hear the latest and greatest from Google Cloud. What they heard was all generative AI, all the time. Google Cloud is first and foremost a cloud infrastructure and platform vendor. If you didn’t know that, you might have missed it in the onslaught of AI news.

Not to minimize what Google had on display, but much like Salesforce last year at its New York City traveling road show, the company failed to give all but a passing nod to its core business — except in the context of generative AI, of course.

Google announced a slew of AI enhancements designed to help customers take advantage of the Gemini large language model (LLM) and improve productivity across the platform. It’s a worthy goal, of course, and throughout the main keynote on Day 1 and the Developer Keynote the following day, Google peppered the announcements with a healthy number of demos to illustrate the power of these solutions.

But many seemed a little too simplistic, even taking into account they needed to be squeezed into a keynote with a limited amount of time. They relied mostly on examples inside the Google ecosystem, when almost every company has much of their data in repositories outside of Google.

Some of the examples actually felt like they could have been done without AI. During an e-commerce demo, for example, the presenter called the vendor to complete an online transaction. It was designed to show off the communications capabilities of a sales bot, but in reality, the step could have been easily completed by the buyer on the website.

That’s not to say that generative AI doesn’t have some powerful use cases, whether creating code, analyzing a corpus of content and being able to query it, or being able to ask questions of the log data to understand why a website went down. What’s more, the task and role-based agents the company introduced to help individual developers, creative folks, employees and others, have the potential to take advantage of generative AI in tangible ways.

But when it comes to building AI tools based on Google’s models, as opposed to consuming the ones Google and other vendors are building for its customers, I couldn’t help feeling that they were glossing over a lot of the obstacles that could stand in the way of a successful generative AI implementation. While they tried to make it sound easy, in reality, it’s a huge challenge to implement any advanced technology inside large organizations.

Big change ain’t easy

Much like other technological leaps over the last 15 years (mobile, cloud, containerization, marketing automation, you name it), generative AI has been delivered with lots of promises of potential gains. Yet these advancements each introduce their own level of complexity, and large companies move more cautiously than we imagine. AI feels like a much bigger lift than Google, or frankly any of the large vendors, is letting on.

What we’ve learned with these previous technology shifts is that they come with a lot of hype and lead to a ton of disillusionment. We’ve seen large companies that perhaps should be taking advantage of these advanced technologies still only dabbling or even sitting out altogether, years after they were introduced.

There are lots of reasons companies may fail to take advantage of technological innovation, including organizational inertia; a brittle technology stack that makes it hard to adopt newer solutions; or a group of corporate naysayers shutting down even the most well-intentioned initiatives, whether legal, HR, IT or other groups that, for a variety of reasons, including internal politics, continue to just say no to substantive change.

Vineet Jain, CEO at Egnyte, a company that concentrates on storage, governance and security, sees two types of companies: those that have made a significant shift to the cloud already and that will have an easier time when it comes to adopting generative AI, and those that have been slow movers and will likely struggle.

He talks to plenty of companies that still have a majority of their tech on-prem and have a long way to go before they start thinking about how AI can help them. “We talk to many ‘late’ cloud adopters who have not started or are very early in their quest for digital transformation,” Jain told TechCrunch.

AI could force these companies to think hard about making a run at digital transformation, but they could struggle starting from so far behind, he said. “These companies will need to solve those problems first and then consume AI once they have a mature data security and governance model,” he said.

It was always the data

The big vendors like Google make implementing these solutions sound simple, but like all sophisticated technology, looking simple on the front end doesn’t necessarily mean it’s uncomplicated on the back end. As I heard often this week, when it comes to the data used to train Gemini and other large language models, it’s still a case of “garbage in, garbage out,” and that’s even more applicable when it comes to generative AI.

It starts with data. If you don’t have your data house in order, it’s going to be very difficult to get it into shape to train the LLMs on your use case. Kashif Rahamatullah, a Deloitte principal who is in charge of the Google Cloud practice at his firm, was mostly impressed by Google’s announcements this week, but still acknowledged that some companies that lack clean data will have problems implementing generative AI solutions. “These conversations can start with an AI conversation, but that quickly turns into: ‘I need to fix my data, and I need to get it clean, and I need to have it all in one place, or almost one place, before I start getting the true benefit out of generative AI,’” Rahamatullah said.

From Google’s perspective, the company has built generative AI tools to more easily help data engineers build data pipelines to connect to data sources inside and outside of the Google ecosystem. “It’s really meant to speed up the data engineering teams, by automating many of the very labor-intensive tasks involved in moving data and getting it ready for these models,” Gerrit Kazmaier, vice president and general manager for database, data analytics and Looker at Google, told TechCrunch.

That should be helpful in connecting and cleaning data, especially in companies that are further along the digital transformation journey. But for those companies like the ones Jain referenced — those that haven’t taken meaningful steps toward digital transformation — it could present more difficulties, even with these tools Google has created.

All of that doesn’t even take into account that AI comes with its own set of challenges beyond pure implementation, whether it’s an app based on an existing model, or especially when trying to build a custom model, says Andy Thurai, an analyst at Constellation Research. “While implementing either solution, companies need to think about governance, liability, security, privacy, ethical and responsible use and compliance of such implementations,” Thurai said. And none of that is trivial.

Executives, IT pros, developers and others who went to GCN this week might have gone looking for what’s coming next from Google Cloud. But if they didn’t go looking for AI, or they are simply not ready as an organization, they may have come away from Sin City a little shell-shocked by Google’s full concentration on AI. It could be a long time before organizations lacking digital sophistication can take full advantage of these technologies, beyond the more-packaged solutions being offered by Google and other vendors.


Databricks spent $10M on new DBRX generative AI model | TechCrunch


If you wanted to raise the profile of your major tech company and had $10 million to spend, how would you spend it? On a Super Bowl ad? An F1 sponsorship?

You could spend it training a generative AI model. While not marketing in the traditional sense, generative models are attention grabbers — and increasingly funnels to vendors’ bread-and-butter products and services.

See Databricks’ DBRX, a new generative AI model announced today akin to OpenAI’s GPT series and Google’s Gemini. Available on GitHub and the AI dev platform Hugging Face for research as well as for commercial use, base (DBRX Base) and fine-tuned (DBRX Instruct) versions of DBRX can be run and tuned on public, custom or otherwise proprietary data.

“DBRX was trained to be useful and provide information on a wide variety of topics,” Naveen Rao, VP of generative AI at Databricks, told TechCrunch in an interview. “DBRX has been optimized and tuned for English language usage, but is capable of conversing and translating into a wide variety of languages, such as French, Spanish and German.”

Databricks describes DBRX as “open source” in a similar vein as “open source” models like Meta’s Llama 2 and AI startup Mistral’s models. (It’s the subject of robust debate as to whether these models truly meet the definition of open source.)

Databricks says that it spent roughly $10 million and two months training DBRX, which it claims (quoting from a press release) “outperform[s] all existing open source models on standard benchmarks.”

But — and here’s the marketing rub — it’s exceptionally hard to use DBRX unless you’re a Databricks customer.

That’s because, in order to run DBRX in the standard configuration, you need a server or PC with at least four Nvidia H100 GPUs (or any other configuration of GPUs that add up to around 320GB of memory). A single H100 costs thousands of dollars — quite possibly more. That might be chump change to the average enterprise, but for many developers and solopreneurs, it’s well beyond reach.
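
The four-GPU figure follows from simple division, which the sketch below makes explicit. It assumes the 80GB variant of the H100 and treats the roughly 320GB total as a hard requirement; both numbers are simplifications, and the helper name `gpus_needed` is purely illustrative.

```python
import math

def gpus_needed(required_gb: float, per_gpu_gb: float) -> int:
    """Smallest number of identical GPUs whose combined memory covers required_gb."""
    return math.ceil(required_gb / per_gpu_gb)

REQUIRED_GB = 320  # approximate total quoted in the article
H100_GB = 80       # assumption: the 80GB H100 variant

# How many H100s it takes to reach the quoted memory total.
print(gpus_needed(REQUIRED_GB, H100_GB))
```

The same helper gives a rough count for other cards, though it ignores interconnect, framework overhead and anything else that affects whether a model actually fits.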

It’s possible to run the model on a third-party cloud, but the hardware requirements are still pretty steep. For example, there’s only one instance type on Google Cloud that incorporates H100 chips. Other clouds may cost less, but generally speaking running huge models like this is not cheap today.

And there’s fine print to boot. Databricks says that companies with more than 700 million active users will face “certain restrictions” comparable to Meta’s for Llama 2, and that all users will have to agree to terms ensuring that they use DBRX “responsibly.” (Databricks hadn’t volunteered those terms’ specifics as of publication time.)

Databricks presents its Mosaic AI Foundation Model product as the managed solution to these roadblocks, which in addition to running DBRX and other models provides a training stack for fine-tuning DBRX on custom data. Customers can privately host DBRX using Databricks’ Model Serving offering, Rao suggested, or they can work with Databricks to deploy DBRX on the hardware of their choosing.

Rao added:

“We’re focused on making the Databricks platform the best choice for customized model building, so ultimately the benefit to Databricks is more users on our platform. DBRX is a demonstration of our best-in-class pre-training and tuning platform, which customers can use to build their own models from scratch. It’s an easy way for customers to get started with the Databricks Mosaic AI generative AI tools. And DBRX is highly capable out-of-the-box and can be tuned for excellent performance on specific tasks at better economics than large, closed models.”

Databricks claims DBRX runs up to 2x faster than Llama 2, in part thanks to its mixture of experts (MoE) architecture. MoE — which DBRX shares in common with Mistral’s newer models and Google’s recently announced Gemini 1.5 Pro — basically breaks down data processing tasks into multiple subtasks and then delegates these subtasks to smaller, specialized “expert” models.

Most MoE models have eight experts. DBRX has 16, which Databricks says improves quality.
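
For readers who want a feel for the routing idea, here is a toy sketch in plain Python. It is not Databricks’ implementation: the "experts" are trivial scaling functions, the router is a random matrix, and the choice to activate four experts per input (`TOP_K`) is an illustrative assumption rather than a figure from the article. Only the count of 16 experts comes from the text above.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_weights, experts, top_k):
    """Send x to the top_k best-scoring experts and blend their outputs."""
    # Router: one score per expert (dot product of x with that expert's gate vector).
    scores = [sum(w * xi for w, xi in zip(gw, x)) for gw in gate_weights]
    # Keep only the top_k experts; the others are never run for this input.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, chosen

def make_expert(scale):
    """Hypothetical stand-in for an expert network: it just scales its input."""
    return lambda x: [scale * xi for xi in x]

random.seed(0)
NUM_EXPERTS = 16  # the expert count the article gives for DBRX
TOP_K = 4         # assumption for illustration only
DIM = 8

gate_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [make_expert(i + 1) for i in range(NUM_EXPERTS)]
x = [random.gauss(0, 1) for _ in range(DIM)]

out, chosen = moe_forward(x, gate_weights, experts, TOP_K)
print("experts used:", sorted(chosen))
```

The point of the design is that only a few experts run for any given input, so a model can hold many parameters in total while doing a fraction of the work per request.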

Quality is relative, however.

While Databricks claims that DBRX outperforms Llama 2 and Mistral’s models on certain language understanding, programming, math and logic benchmarks, DBRX falls short of arguably the leading generative AI model, OpenAI’s GPT-4, in most areas outside of niche use cases like database programming language generation.

Now, as some on social media have pointed out, DBRX and GPT-4, which cost significantly more to train, are very different — perhaps too different to warrant a direct comparison. It’s important that these large, enterprise-funded models get compared to the best of the field, but what distinguishes them should also be pointed out, like the fact that DBRX is “open source” and targeted at a distinctly enterprise audience.

At the same time, it can’t be ignored that DBRX is somewhat close to flagship models like GPT-4 in that it’s cost-prohibitive for the average person to run, its training data isn’t open and it isn’t open source in the strictest definition.

Rao admits that DBRX has other limitations as well, namely that it — like all other generative AI models — can fall victim to “hallucinating” answers to queries despite Databricks’ work in safety testing and red teaming. Because the model was simply trained to associate words or phrases with certain concepts, if those associations aren’t totally accurate, its responses won’t always be accurate.

Also, DBRX is not multimodal, unlike some more recent flagship generative AI models, including Gemini. (It can only process and generate text, not images.) And we don’t know exactly what sources of data were used to train it; Rao would only reveal that no Databricks customer data was used in training DBRX.

“We trained DBRX on a large set of data from a diverse range of sources,” he added. “We used open data sets that the community knows, loves and uses every day.”

I asked Rao if any of the DBRX training data sets were copyrighted or licensed, or show obvious signs of biases (e.g. racial biases), but he didn’t answer directly, saying only, “We’ve been careful about the data used, and conducted red teaming exercises to improve the model’s weaknesses.” Generative AI models have a tendency to regurgitate training data, a major concern for commercial users of models trained on unlicensed, copyrighted or very clearly biased data. In the worst-case scenario, a user could end up on the ethical and legal hooks for unwittingly incorporating IP-infringing or biased work from a model into their projects.

Some companies training and releasing generative AI models offer policies covering the legal fees arising from possible infringement. Databricks doesn’t at present — Rao says that the company’s “exploring scenarios” under which it might.

Given this and the other aspects in which DBRX misses the mark, the model seems like a tough sell to anyone but current or would-be Databricks customers. Databricks’ rivals in generative AI, including OpenAI, offer equally if not more compelling technologies at very competitive pricing. And plenty of generative AI models come closer to the commonly understood definition of open source than DBRX.

Rao promises that Databricks will continue to refine DBRX and release new versions as the company’s Mosaic Labs R&D team — the team behind DBRX — investigates new generative AI avenues.

“DBRX is pushing the open source model space forward and challenging future models to be built even more efficiently,” he said. “We’ll be releasing variants as we apply techniques to improve output quality in terms of reliability, safety and bias … We see the open model as a platform on which our customers can build custom capabilities with our tools.”

Judging by where DBRX now stands relative to its peers, it’s an exceptionally long road ahead.

This story was corrected to note that the model took two months to train, and removed an incorrect reference to Llama 2 in the fourteenth paragraph. We regret the errors.


Google.org launches $20M generative AI accelerator program | TechCrunch


Google.org, Google’s charitable wing, is launching a new program to help fund nonprofits developing tech that leverages generative AI.

Called Google.org Accelerator: Generative AI, the program is to be funded by $20 million in grants and will start with 21 nonprofits, including Quill.org, which is creating AI-powered tools for student writing feedback, and the World Bank, which is building a generative AI app to make development research more accessible.

In addition to funding, nonprofits in the six-month accelerator program will get access to technical training, workshops, mentors and guidance from an “AI coach.” And, through Google.org’s fellowship program, teams of Google employees will work with three of the nonprofits — Tarjimly, Benefits Data Trust and mRelief — full-time for up to six months to help launch their proposed generative AI tools.

Tarjimly aims to use AI to translate languages for refugees, while Benefits Data Trust is tapping AI to create assistants that support caseworkers in helping low-income applicants enroll in public benefits. mRelief, meanwhile, is designing a tool to streamline the U.S. SNAP benefits application process.

“Generative AI can help social impact teams be more productive, creative and effective in serving their communities,” Annie Lewin, director of global advocacy at Google.org, said in a blog post. “Google.org funding recipients report that AI helps them achieve their goals in one third of the time at nearly half the cost.”

According to a PwrdBy survey, 73% of nonprofits believe AI innovation aligns with their missions and 75% believe AI makes their lives easier, particularly in areas like donor categorization, routine back-office tasks and “mission-driven” initiatives. But there remain significant barriers for nonprofits looking to build their own AI solutions or adopt third-party products — chiefly cost, resources and time.

In the blog post, Lewin cites a Google.org survey that similarly found that, while four in five nonprofits think generative AI may be applicable to their work, nearly half currently aren’t using the tech as a result of a range of internal and external roadblocks. “[These nonprofits] cite a lack of tools, awareness, training and funding as the biggest barriers to adoption,” she said.

Encouragingly, the number of nonprofit AI-focused startups is beginning to tick up.

Nonprofit accelerator Fast Forward said that this year, more than a third of applicants for its latest class were AI companies. And Crunchbase reports that, more broadly, dozens of nonprofit organizations across the globe are dedicating work around ethical approaches to AI, like AI ethics lab AlgorithmWatch, virtual reading clinic JoyEducation and conservation advocacy group Earth05.


Google Gemini: Everything you need to know about the new generative AI platform | TechCrunch


Google’s trying to make waves with Gemini, a new generative AI platform that recently made its big debut. But while Gemini appears to be promising in a few aspects, it’s falling short in others. So what is Gemini? How can you use it? And how does it stack up to the competition?

To make it easier to keep up with the latest Gemini developments, we’ve put together this handy guide, which we’ll keep updated as new Gemini models and features are released.

What is Gemini?

Gemini is Google’s long-promised, next-gen generative AI model family, developed by Google’s AI research labs DeepMind and Google Research. It comes in three flavors:

  • Gemini Ultra, the flagship Gemini model
  • Gemini Pro, a “lite” Gemini model
  • Gemini Nano, a smaller “distilled” model that runs on mobile devices like the Pixel 8 Pro

All Gemini models were trained to be “natively multimodal,” in other words, able to work with and use more than just text. They were pre-trained and fine-tuned on a variety of audio, images and videos, a large set of codebases, and text in different languages.

That sets Gemini apart from models such as Google’s own large language model LaMDA, which was only trained on text data. LaMDA can’t understand or generate anything other than text (e.g. essays, email drafts and so on) — but that isn’t the case with Gemini models. Their ability to understand images, audio and other modalities is still limited, but it’s better than nothing.

What’s the difference between Bard and Gemini?

Google, proving once again that it lacks a knack for branding, didn’t make it clear from the outset that Gemini is separate and distinct from Bard. Bard is simply an interface through which certain Gemini models can be accessed — think of it as an app or client for Gemini and other gen AI models. Gemini, on the other hand, is a family of models — not an app or frontend. There’s no standalone Gemini experience, nor will there likely ever be. If you were to compare to OpenAI’s products, Bard corresponds to ChatGPT, OpenAI’s popular conversational AI app, and Gemini corresponds to the language model that powers it, which in ChatGPT’s case is GPT-3.5 or 4.

Incidentally, Gemini is also totally independent from Imagen-2, a text-to-image model that may or may not fit into the company’s overall AI strategy. Don’t worry, you’re not the only one confused by this!

What can Gemini do?

Because the Gemini models are multimodal, they can in theory perform a range of tasks, from transcribing speech to captioning images and videos to generating artwork. Few of these capabilities have reached the product stage yet (more on that later), but Google’s promising all of them — and more — at some point in the not-too-distant future.

Of course, it’s a bit hard to take the company at its word.

Google seriously under-delivered with the original Bard launch. And more recently it ruffled feathers with a video purporting to show Gemini’s capabilities that turned out to have been heavily doctored and was more or less aspirational. Gemini is, to the tech giant’s credit, available in some form today — but a rather limited form.

Still, assuming Google is being more or less truthful with its claims, here’s what the different tiers of Gemini models will be able to do once they’re released:

Gemini Ultra

Few people have gotten their hands on Gemini Ultra, the “foundation” model on which the others are built, so far — just a “select set” of customers across a handful of Google apps and services. That won’t change until sometime later this year, when Google’s largest model launches more broadly. Most info about Ultra has come from Google-led product demos, so it’s best taken with a grain of salt.

Google says that Gemini Ultra can be used to help with things like physics homework, solving problems step-by-step on a worksheet and pointing out possible mistakes in already filled-in answers. Gemini Ultra can also be applied to tasks such as identifying scientific papers relevant to a particular problem, Google says — extracting information from those papers and “updating” a chart from one by generating the formulas necessary to recreate the chart with more recent data.

Gemini Ultra technically supports image generation, as alluded to earlier. But that capability won’t make its way into the productized version of the model at launch, according to Google — perhaps because the mechanism is more complex than how apps such as ChatGPT generate images. Rather than feed prompts to an image generator (like DALL-E 3, in ChatGPT’s case), Gemini outputs images “natively” without an intermediary step.

Gemini Pro

Unlike Gemini Ultra, Gemini Pro is available publicly today. But confusingly, its capabilities depend on where it’s used.

Google says that in Bard, where Gemini Pro launched first in text-only form, the model is an improvement over LaMDA in its reasoning, planning and understanding capabilities. An independent study by Carnegie Mellon and BerriAI researchers found that Gemini Pro is indeed better than OpenAI’s GPT-3.5 at handling longer and more complex reasoning chains.

But the study also found that, like all large language models, Gemini Pro particularly struggles with math problems involving several digits, and users have found plenty of examples of bad reasoning and mistakes. It made plenty of factual errors for simple queries like who won the latest Oscars. Google has promised improvements, but it’s not clear when they’ll arrive.

Gemini Pro is also available via API in Vertex AI, Google’s fully managed AI developer platform. The API accepts text as input and generates text as output. An additional endpoint, Gemini Pro Vision, can process text and imagery (including photos and video) and output text, along the lines of OpenAI’s GPT-4 with Vision model.

Within Vertex AI, developers can customize Gemini Pro to specific contexts and use cases using a fine-tuning or “grounding” process. Gemini Pro can also be connected to external, third-party APIs to perform particular actions.

Sometime in “early 2024,” Vertex customers will be able to tap Gemini Pro to power custom-built conversational voice and chat agents (i.e. chatbots). Gemini Pro will also become an option for driving search summarization, recommendation and answer generation features in Vertex AI, drawing on documents across modalities (e.g. PDFs, images) from different sources (e.g. OneDrive, Salesforce) to satisfy queries.

In AI Studio, Google’s web-based tool for app and platform developers, there are workflows for creating freeform, structured and chat prompts using Gemini Pro. Developers have access to both the Gemini Pro and Gemini Pro Vision endpoints, and they can adjust the model temperature to control the output’s creative range, provide examples to give tone and style instructions, and tune the safety settings.
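
To see why temperature controls "creative range," it helps to look at how the setting is commonly applied in text generators: the model’s raw scores for candidate next words are divided by the temperature before being turned into probabilities. The sketch below shows that general technique with made-up scores; how Gemini applies the setting internally isn’t described here, so treat this as an assumption about the typical approach.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
logits = [2.0, 1.0, 0.1]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperatures nearly all the probability lands on the top-scoring word, which makes output more predictable; at higher temperatures the alternatives get a real chance of being picked.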

Gemini Nano

Gemini Nano is a much smaller version of the Gemini Pro and Ultra models, and it’s efficient enough to run directly on (some) phones instead of sending the task to a server somewhere. So far it powers two features on the Pixel 8 Pro: Summarize in Recorder and Smart Reply in Gboard.

The Recorder app, which lets users push a button to record and transcribe audio, includes a Gemini-powered summary of your recorded conversations, interviews, presentations and other snippets. Users get these summaries even if they don’t have a signal or Wi-Fi connection available — and in a nod to privacy, no data leaves their phone in the process.

Gemini Nano is also in Gboard, Google’s keyboard app, as a developer preview. There, it powers a feature called Smart Reply, which helps to suggest the next thing you’ll want to say when having a conversation in a messaging app. The feature initially only works with WhatsApp, but will come to more apps in 2024, Google says.

Is Gemini better than OpenAI’s GPT-4?

There’s no way to know how the Gemini family really stacks up until Google releases Ultra later this year, but the company has claimed improvements on the state of the art — which is usually OpenAI’s GPT-4.

Google has several times touted Gemini’s superiority on benchmarks, claiming that Gemini Ultra exceeds current state-of-the-art results on “30 of the 32 widely used academic benchmarks used in large language model research and development.” The company says that Gemini Pro, meanwhile, is more capable at tasks like summarizing content, brainstorming and writing than GPT-3.5.

But leaving aside the question of whether benchmarks really indicate a better model, the scores Google points to appear to be only marginally better than OpenAI’s corresponding models. And — as mentioned earlier — some early impressions haven’t been great, with users and academics pointing out that Gemini Pro tends to get basic facts wrong, struggles with translations, and gives poor coding suggestions.

How much will Gemini cost?

Gemini Pro is free to use in Bard and, for now, AI Studio and Vertex AI.

Once Gemini Pro exits preview in Vertex, however, the model’s input will cost $0.0025 per character while output will cost $0.00005 per character. Vertex customers pay per 1,000 characters (about 140 to 250 words) and, in the case of models like Gemini Pro Vision, per image ($0.0025).

Let’s assume a 500-word article contains 2,000 characters. Summarizing that article with Gemini Pro would cost $5. Meanwhile, generating an article of a similar length would cost $0.10.
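
The arithmetic behind those two figures is spelled out in the sketch below. It takes the per-character rates quoted above at face value and applies them to the assumed 2,000-character article; the function and constant names are illustrative, and actual Vertex AI billing may round or meter differently.

```python
from decimal import Decimal

# Rates as quoted in the article, in dollars per character.
INPUT_RATE = Decimal("0.0025")
OUTPUT_RATE = Decimal("0.00005")

def cost(characters: int, rate_per_character: Decimal) -> Decimal:
    """Dollar cost of processing a given number of characters at a flat rate."""
    return characters * rate_per_character

ARTICLE_CHARS = 2000  # the article's assumption for a 500-word piece

# Summarizing: the article is sent to the model as input.
print("summarize:", cost(ARTICLE_CHARS, INPUT_RATE))
# Generating: an article of similar length comes back as output.
print("generate:", cost(ARTICLE_CHARS, OUTPUT_RATE))
```

`Decimal` is used instead of floats so that small per-character rates multiply out exactly.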

Where can you try Gemini?

Gemini Pro

The easiest place to experience Gemini Pro is in Bard. A fine-tuned version of Pro is answering text-based Bard queries in English in the U.S. right now, with additional languages and supported countries set to arrive down the line.

Gemini Pro is also accessible in preview in Vertex AI via an API. The API is free to use “within limits” for the time being and supports 38 languages and regions including Europe, as well as features like chat functionality and filtering.

Elsewhere, Gemini Pro can be found in AI Studio. Using the service, developers can iterate prompts and Gemini-based chatbots and then get API keys to use them in their apps — or export the code to a more fully featured IDE.

Duet AI for Developers, Google’s suite of AI-powered assistance tools for code completion and generation, will start using a Gemini model in the coming weeks. And Google plans to bring Gemini models to dev tools for Chrome and its Firebase mobile dev platform around the same time, in early 2024.

Gemini Nano

Gemini Nano is on the Pixel 8 Pro — and will come to other devices in the future. Developers interested in incorporating the model into their Android apps can sign up for a sneak peek.

We’ll keep this post up to date with the latest developments.

