One year ago, the IT Division's ScienceIT Department, which supports scientists at the Lab with their data and computation needs, began discussing in earnest which AI tools might be most valuable to researchers at the Lab and how to make those tools available. With interest growing in rapidly evolving artificial intelligence capabilities and their potential for science, the team felt an urgency to make AI tools easy for any Lab researcher to access and use.
With input from research teams, ScienceIT decided to focus on the following key offerings: easy, one-stop access to commercial models; an on-premises platform for cases where data privacy was a concern; and an API service to allow research teams to build their own large language model (LLM)-powered applications.
Andrew Schmeder, a consultant with ScienceIT tasked with managing the development of the suite of AI tools, said, “We wanted to reduce barriers to entry, allowing researchers to engage with AI tools instantly. Commercial models often have complex licensing terms and require researchers to go through a procurement process. By offering these services through ScienceIT, they can jump right in. Also, in many cases, data privacy concerns have kept potential users from engaging with these commercial AI tools, so we wanted to create self-hosted AI systems where the data does not leave the Lab network. Finally, we also wanted to empower those who wanted to develop their own LLM-powered applications by providing APIs.”
ScienceIT consultant Tim Fong noted that the team was right to plan early. Making the on-premises platform possible required significant high performance computing and networking equipment, and with skyrocketing global demand for AI hardware, the waitlist for equipment like the Nvidia DGX H100 (a powerful GPU system purpose-built for AI infrastructure and workloads) turned out to be almost a year long. Ultimately the Lab invested $1.2-1.5 million in the supercomputing and networking hardware needed to run the on-premises large language models.
“The equipment arrived just as the inquiries about AI tools reached a crescendo,” said Tim. “Because the team was very familiar with high performance computing and networking systems, we were able to quickly get the machines up on the network and functioning.”
In early August, ScienceIT introduced the CBORG AI Portal, giving researchers at the Lab access to a comprehensive set of AI resources: commercial AI services (such as OpenAI's GPT-4o and Google Gemini Pro), Lab-hosted open models for free, on-premises use, and an API service with technical support so that researchers can develop their own LLM-powered applications.
“CBORG is a central point of access that lets everyone jump in quickly with a minimum of roadblocks between their idea and ability to execute against that idea,” said Andrew.
The Many Uses of AI to Support and Enhance Research
Since its launch, CBORG has already attracted more than 450 users. Many are using the tools to support software development and troubleshooting, a valuable capability that AI models offer the Lab.
Said Zhong Wang, a staff scientist at the Joint Genome Institute, “I leveraged ChatGPT and the other models that ScienceIT built to get my students started quickly in their research projects. My research requires both strong programming skills and a good understanding of genomics, but few students are equipped with both. This year I adopted a ‘learning on-demand’ strategy, where I encouraged the students to focus on solving the problem while using AI to provide them with the necessary knowledge and skill assistance in real time. I mentored three students with either data science or biology backgrounds, and everyone started to work on the research project during the first week! At the end of their internships, all gave fabulous presentations with plenty of research outcomes.”
Tim, who serves as a resource for researchers thinking about how best to use AI and which tools to bring to bear on particular problems, said, "AI tools are also well-suited to processing large amounts of data, automating manual, time-consuming tasks."
For example, Julie Mulvaney Kemp, a scientist in ETA's Energy Markets and Policy Department, is using Llama 3.1 on the on-premises platform to extract solar power plant information (including power generation capacity, generation technology, project cost, and details of any energy storage paired with the plant) from news articles.
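As a rough illustration of this kind of extraction workflow (not the actual pipeline used in this project), the sketch below assumes the Lab-hosted model is reachable through an OpenAI-compatible endpoint; the base URL, model name, and field list are placeholders:

```python
# Minimal sketch: structured extraction from a news article with a hosted LLM.
# Assumes an OpenAI-compatible endpoint; endpoint, credentials, and model name
# below are illustrative, not the actual CBORG configuration.
import json
import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ["LLM_API_BASE"],  # hypothetical endpoint for the Lab-hosted model
    api_key=os.environ["LLM_API_KEY"],
)

article = open("solar_plant_article.txt").read()  # illustrative input file

prompt = (
    "From the news article below, return only a JSON object with the fields "
    "capacity_mw, generation_technology, project_cost_usd, and storage_details. "
    "Use null for anything not stated.\n\n" + article
)

response = client.chat.completions.create(
    model="llama-3.1-70b",  # illustrative model name
    temperature=0,          # deterministic output is preferable for extraction
    messages=[{"role": "user", "content": prompt}],
)

# Parse the model's JSON reply (a production script would validate this output).
record = json.loads(response.choices[0].message.content)
print(record)
```

Run over a folder of articles, a loop like this turns unstructured news text into a table that can be analyzed directly.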
The ScienceIT team sees further opportunities for researchers, including using the chat interface and the API service to build their own LLM-powered applications that batch process data, mine patterns, or connect to code-generation agents that write code. The API service currently has 35 users, a number the ScienceIT team expects to grow quickly.
“We’re just beginning to understand the many uses of AI to speed up research,” said Tim.
Stay Tuned for More from CBORG
The ScienceIT team is continuing to develop new offerings for CBORG. One project involves developing applications based on "retrieval augmented generation," or "RAG," in which a chatbot can, for example, pull in relevant content that isn't on the public web but resides in the Lab's rich databases and information repositories.
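In rough terms, a RAG application first retrieves relevant passages from an internal collection and then passes them to the model alongside the question. The sketch below is a minimal illustration of that pattern, again assuming an OpenAI-compatible endpoint; the toy retriever, model name, and endpoint are placeholders, and a production system would typically use embeddings and a vector database rather than word overlap:

```python
# Bare-bones retrieval augmented generation (RAG) sketch.
# Endpoint, model name, and retriever are illustrative only.
import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ["LLM_API_BASE"],  # hypothetical endpoint
    api_key=os.environ["LLM_API_KEY"],
)


def retrieve(question: str, documents: list[str], top_k: int = 3) -> list[str]:
    """Toy retriever: rank internal documents by word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def answer(question: str, documents: list[str]) -> str:
    """Combine retrieved context with the question and ask the model to answer."""
    context = "\n\n".join(retrieve(question, documents))
    response = client.chat.completions.create(
        model="llama-3.1-70b",  # illustrative model name
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```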
Further down the road, CBORG could provide more advanced software programming assistance, integrating chat and code-completion tools into development environments. Agent-based systems could implement complex projects, for example by accessing existing code on a machine, uploading the relevant content to the model, developing and testing options for new code, and then presenting an action plan.
To help the IT team meet current AI needs and plan for future ones, IT is seeking input from Areas, Divisions, and research groups. To discuss these needs with IT, or for any questions about CBORG or AI services from IT, contact scienceit@lbl.gov.
For more information:
Learn more about AI services for productivity and science at go.lbl.gov/ai
Access the CBORG AI Portal
Read the CBORG launch story and FAQ
Watch the recorded webinar on “Building Infrastructure for Cutting-Edge Scientific AI” from Aug. 29 (includes discussion of NERSC and IT AI infrastructure and resources)