AWS Machine Learning Blog
Official Machine Learning Blog of Amazon Web Services

  • Mixtral 8x22B is now available in Amazon SageMaker JumpStart
    by Marco Punio on May 17, 2024 at 4:02 pm

    Today, we are excited to announce that the Mixtral-8x22B large language model (LLM), developed by Mistral AI, is available for customers through Amazon SageMaker JumpStart to deploy with one click for running inference. You can try out this model with SageMaker JumpStart, a machine learning (ML) hub that provides access to algorithms and models so you …
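
    As a rough sketch of what the deploy flow looks like from the SageMaker Python SDK (rather than the console's one-click button), something like the following should work; the model_id string is an assumption and should be looked up in the JumpStart model hub.

      # Minimal sketch of deploying and querying a JumpStart model.
      # The model_id is a placeholder guess for Mixtral-8x22B.
      from sagemaker.jumpstart.model import JumpStartModel

      model = JumpStartModel(model_id="huggingface-llm-mixtral-8x22b")  # placeholder ID
      predictor = model.deploy()  # provisions a real-time inference endpoint

      # The payload follows the Hugging Face LLM container convention used
      # by many JumpStart text-generation models.
      print(predictor.predict({"inputs": "Explain Amazon SageMaker in one sentence."}))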

  • Building Generative AI prompt chaining workflows with human in the loop
    by Veda Raman on May 17, 2024 at 3:51 pm

    While Generative AI can create highly realistic content, including text, images, and videos, it can also generate outputs that appear plausible but are verifiably incorrect. Incorporating human judgment is crucial, especially in complex and high-risk decision-making scenarios. This involves building a human-in-the-loop process, where humans play an active role in decision making alongside the AI system. In this blog post, you will learn about prompt chaining: how to break a complex task into a sequence of smaller tasks that an LLM handles in a specific order, and how to involve a human to review the responses generated by the LLM.
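
    To make the idea concrete, here is a minimal, framework-free sketch of a two-step prompt chain with a human review gate. The call_llm function, the prompts, and the chain itself are illustrative assumptions, not the post's actual implementation.

      # Hedged sketch of prompt chaining with a human-in-the-loop step.
      def call_llm(prompt: str) -> str:
          """Placeholder: wire this to your model API (e.g., Amazon Bedrock)."""
          raise NotImplementedError

      def draft_reply(document: str) -> str:
          # Step 1: condense the input so the next prompt stays focused.
          summary = call_llm(f"Summarize the key facts in:\n{document}")
          # Step 2: generate the final output from the intermediate result.
          return call_llm(f"Write a customer reply using only these facts:\n{summary}")

      def human_review(draft: str) -> str:
          # A human approves the draft or sends it back for revision.
          print(draft)
          if input("Approve this response? [y/n] ").lower().startswith("y"):
              return draft
          concern = input("What should change? ")
          return call_llm(f"Revise this reply to address: {concern}\n\n{draft}")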

  • How LotteON built a personalized recommendation system using Amazon SageMaker and MLOps
    by SeungBum Shim on May 16, 2024 at 4:13 pm

    This post is co-written with HyeKyung Yang, Jieun Lim, and SeungBum Shim from LotteON. LotteON aims to be a platform that not only sells products, but also provides a personalized recommendation experience tailored to your preferred lifestyle. LotteON operates various specialty stores, including fashion, beauty, luxury, and kids, and strives to provide a personalized shopping …

  • Build a serverless exam generator application from your own lecture content using Amazon Bedrock
    by Merieme Ezzaouia on May 15, 2024 at 4:21 pm

    Crafting new questions for exams and quizzes can be tedious and time-consuming for educators. The time required varies based on factors like subject matter, question types, experience level, and class level. Multiple-choice questions require substantial time to generate quality distractors and ensure a single unambiguous answer, and composing effective true-false questions demands careful effort to …
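
    The core generation step behind an application like this can be quite small. Below is a hedged sketch of asking a Bedrock-hosted model for multiple-choice questions via the boto3 Converse API; the model ID and prompt are assumptions, not the article's actual code.

      # Minimal sketch of generating exam questions with Amazon Bedrock.
      import boto3

      bedrock = boto3.client("bedrock-runtime")  # credentials/region from your AWS config
      response = bedrock.converse(
          modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # placeholder model choice
          messages=[{
              "role": "user",
              "content": [{"text": "Write 3 multiple-choice questions about photosynthesis, "
                                   "each with one correct answer and three plausible distractors."}],
          }],
      )
      print(response["output"]["message"]["content"][0]["text"])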

  • Accelerate NLP inference with ONNX Runtime on AWS Graviton processors
    by Sunita Nadampalli on May 15, 2024 at 4:03 pm

    ONNX is an open source machine learning (ML) framework that provides interoperability across a wide range of frameworks, operating systems, and hardware platforms. ONNX Runtime is the runtime engine used for model inference and training with ONNX. AWS Graviton3 processors are optimized for ML workloads, including support for bfloat16, Scalable Vector Extension (SVE), and Matrix …
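
    For reference, running an exported NLP model under ONNX Runtime is only a few lines; a minimal sketch follows, where the model path and the fake token IDs are placeholders and a single-input model is assumed. On Graviton, the default CPU execution provider picks up the Arm-optimized kernels.

      # Minimal ONNX Runtime inference sketch (single-input model assumed).
      import numpy as np
      import onnxruntime as ort

      session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
      input_name = session.get_inputs()[0].name
      token_ids = np.random.randint(0, 30522, size=(1, 128), dtype=np.int64)  # fake batch
      outputs = session.run(None, {input_name: token_ids})
      print(outputs[0].shape)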

The Berkeley Artificial Intelligence Research Blog

  • Modeling Extremely Large Images with xT
    on March 21, 2024 at 9:00 am

    As computer vision researchers, we believe that every pixel can tell a story. However, there seems to be a writer’s block settling into the field when it comes to dealing with large images. Large images are no longer rare—the cameras we carry in our pockets and those orbiting our planet snap pictures so big and detailed that they stretch our current best models and hardware to their breaking points. Generally, we face a quadratic increase in memory usage as a function of image size. Today, we make one of two sub-optimal choices when handling large images: down-sampling or cropping. These two methods incur significant losses in the amount of information and context present in an image. We take another look at these approaches and introduce $x$T, a new framework to model large images end-to-end on contemporary GPUs while effectively aggregating global context with local details.

    Figure: Architecture for the $x$T framework.

    Why Bother with Big Images Anyway?

    Picture yourself in front of your TV, watching your favorite football team. The field is dotted with players all over, with action occurring on only a small portion of the screen at a time. Would you be satisfied, however, if you could only see a small region around where the ball currently was? Alternatively, would you be satisfied watching the game in low resolution? Every pixel tells a story, no matter how far apart the pixels are. This is true in all domains, from your TV screen to a pathologist viewing a gigapixel slide to diagnose tiny patches of cancer. These images are treasure troves of information. If we can’t fully explore the wealth because our tools can’t handle the map, what’s the point?

    Figure: Sports are fun when you know what’s going on.

    That’s precisely where the frustration lies today. The bigger the image, the more we need to simultaneously zoom out to see the whole picture and zoom in for the nitty-gritty details, making it a challenge to grasp both the forest and the trees. Most current methods force a choice between losing sight of the forest or missing the trees, and neither option is great.

    How $x$T Tries to Fix This

    Imagine trying to solve a massive jigsaw puzzle. Instead of tackling the whole thing at once, which would be overwhelming, you start with smaller sections, get a good look at each piece, and then figure out how they fit into the bigger picture. That’s basically what we do with large images with $x$T. $x$T takes these gigantic images and chops them into smaller, more digestible pieces hierarchically. This isn’t just about making things smaller, though. It’s about understanding each piece in its own right and then, using some clever techniques, figuring out how these pieces connect on a larger scale. It’s like having a conversation with each part of the image, learning its story, and then sharing those stories with the other parts to get the full narrative.

    Nested Tokenization

    At the core of $x$T lies the concept of nested tokenization. In simple terms, tokenization in the realm of computer vision is akin to chopping up an image into pieces (tokens) that a model can digest and analyze. However, $x$T takes this a step further by introducing a hierarchy into the process—hence, nested. Imagine you’re tasked with analyzing a detailed city map. Instead of trying to take in the entire map at once, you break it down into districts, then neighborhoods within those districts, and finally, streets within those neighborhoods. This hierarchical breakdown makes it easier to manage and understand the details of the map while keeping track of where everything fits in the larger picture. That’s the essence of nested tokenization—we split an image into regions, each of which can be split into further sub-regions depending on the input size expected by a vision backbone (what we call a region encoder), before being patchified to be processed by that region encoder. This nested approach allows us to extract features at different scales on a local level.

    Coordinating Region and Context Encoders

    Once an image is neatly divided into tokens, $x$T employs two types of encoders to make sense of these pieces: the region encoder and the context encoder. Each plays a distinct role in piecing together the image’s full story. The region encoder is a standalone “local expert” which converts independent regions into detailed representations. However, since each region is processed in isolation, no information is shared across the image at large. The region encoder can be any state-of-the-art vision backbone. In our experiments we have utilized hierarchical vision transformers such as Swin and Hiera and also CNNs such as ConvNeXt! Enter the context encoder, the big-picture guru. Its job is to take the detailed representations from the region encoders and stitch them together, ensuring that the insights from one token are considered in the context of the others. The context encoder is generally a long-sequence model. We experiment with Transformer-XL (and our variant of it called Hyper) and Mamba, though you could use Longformer and other new advances in this area. Even though these long-sequence models are generally made for language, we demonstrate that it is possible to use them effectively for vision tasks.

    The magic of $x$T is in how these components—the nested tokenization, region encoders, and context encoders—come together. By first breaking down the image into manageable pieces and then systematically analyzing these pieces both in isolation and in conjunction, $x$T manages to maintain the fidelity of the original image’s details while also integrating the overarching long-distance context, all while fitting massive images, end-to-end, on contemporary GPUs.
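
    To make the pipeline concrete, here is a minimal PyTorch-style sketch of one level of region splitting feeding a region encoder and a context encoder. The module names, region size, and single-level split are illustrative assumptions, not the released $x$T code (linked below).

      # Sketch: split a huge image into regions, encode each locally, then
      # share information globally with a long-sequence context encoder.
      import torch
      import torch.nn as nn

      def split_into_regions(image: torch.Tensor, r: int) -> torch.Tensor:
          """Chop a (C, H, W) image into non-overlapping (N, C, r, r) regions."""
          c, _, _ = image.shape
          regions = image.unfold(1, r, r).unfold(2, r, r)   # (C, nH, nW, r, r)
          return regions.permute(1, 2, 0, 3, 4).reshape(-1, c, r, r)

      class XTStyleModel(nn.Module):
          def __init__(self, region_encoder: nn.Module, context_encoder: nn.Module,
                       region_size: int = 256):
              super().__init__()
              self.region_encoder = region_encoder    # "local expert", e.g. Swin/Hiera
              self.context_encoder = context_encoder  # "big-picture guru", long-sequence model
              self.region_size = region_size

          def forward(self, image: torch.Tensor) -> torch.Tensor:
              regions = split_into_regions(image, self.region_size)
              local = self.region_encoder(regions)       # (N, D): isolated region features
              return self.context_encoder(local.unsqueeze(0))  # mix context across regions
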
    Results

    We evaluate $x$T on challenging benchmark tasks that range from well-established computer vision baselines to rigorous large-image tasks. Particularly, we experiment with iNaturalist 2018 for fine-grained species classification, xView3-SAR for context-dependent segmentation, and MS-COCO for detection. Our experiments show that $x$T can achieve higher accuracy on all downstream tasks with fewer parameters while using much less memory per region than state-of-the-art baselines*. We are able to model images as large as 29,000 x 25,000 pixels on 40GB A100s, while comparable baselines run out of memory at only 2,800 x 2,800 pixels.

    Figure: Powerful vision models used with $x$T set a new frontier on downstream tasks such as fine-grained species classification.

    *Depending on your choice of context model, such as Transformer-XL.

    Why This Matters More Than You Think

    This approach isn’t just cool; it’s necessary. For scientists tracking climate change or doctors diagnosing diseases, it’s a game-changer. It means creating models which understand the full story, not just bits and pieces.
    In environmental monitoring, for example, being able to see both the broader changes over vast landscapes and the details of specific areas can help in understanding the bigger picture of climate impact. In healthcare, it could mean the difference between catching a disease early or not. We are not claiming to have solved all the world’s problems in one go. We are hoping that with $x$T we have opened the door to what’s possible. We’re stepping into a new era where we don’t have to compromise on the clarity or breadth of our vision. $x$T is our big leap towards models that can juggle the intricacies of large-scale images without breaking a sweat. There’s a lot more ground to cover. Research will evolve, and hopefully, so will our ability to process even bigger and more complex images. In fact, we are working on follow-ons to $x$T which will expand this frontier further.

    In Conclusion

    For a complete treatment of this work, please check out the paper on arXiv. The project page contains a link to our released code and weights. If you find the work useful, please cite it as below:

      @article{xTLargeImageModeling,
        title={xT: Nested Tokenization for Larger Context in Large Images},
        author={Gupta, Ritwik and Li, Shufan and Zhu, Tyler and Malik, Jitendra and Darrell, Trevor and Mangalam, Karttikeya},
        journal={arXiv preprint arXiv:2403.01915},
        year={2024}
      }

  • 2024 BAIR Graduate Directory
    on March 11, 2024 at 9:00 am

    Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning. Our Ph.D. graduates have each expanded the frontiers of AI research and are now ready to embark on new adventures in academia, industry, and beyond. These fantastic individuals bring with them a wealth of knowledge, fresh ideas, and a drive to continue contributing to the advancement of AI. Their work at BAIR, ranging from deep learning, robotics, and natural language processing to computer vision, security, and much more, has contributed significantly to their fields and has had transformative impacts on society. This website is dedicated to showcasing our colleagues, making it easier for academic institutions, research organizations, and industry leaders to discover and recruit from the newest generation of AI pioneers. Here, you’ll find detailed profiles, research interests, and contact information for each of our graduates. We invite you to explore the potential collaborations and opportunities these graduates present as they seek to apply their expertise and insights in new environments. Join us in celebrating the achievements of BAIR’s latest Ph.D. graduates. Their journey is just beginning, and the future they will help build is bright! Thank you to our friends at the Stanford AI Lab for this idea!

    Abdus Salam Azad
    Email: salam_azad@berkeley.edu | Website: https://www.azadsalam.org/ | Advisor(s): Ion Stoica
    Research Blurb: My research interest lies broadly in the field of Machine Learning and Artificial Intelligence. During my PhD I have focused on environment generation/curriculum learning methods for training autonomous agents with reinforcement learning. Specifically, I work on methods that algorithmically generate diverse training environments (i.e., learning scenarios) for autonomous agents to improve generalization and sample efficiency. Currently, I am working on Large Language Model (LLM)-based autonomous agents.
    Jobs Interested In: Research Scientist, ML Engineer

    Alicia Tsai
    Email: aliciatsai@berkeley.edu | Website: https://www.aliciatsai.com/ | Advisor(s): Laurent El Ghaoui
    Research Blurb: My research delves into the theoretical aspects of deep implicit models, beginning with a unified “state-space” representation that simplifies notation. Additionally, my work explores various training challenges associated with deep learning, including problems amenable to convex and non-convex optimization. In addition to theoretical exploration, my research extends the potential applications to various problem domains, including natural language processing and natural science.
    Jobs Interested In: Research Scientist, Applied Scientist, Machine Learning Engineer

    Catherine Weaver
    Email: catherine22@berkeley.edu | Website: https://cwj22.github.io | Advisor(s): Masayoshi Tomizuka, Wei Zhan
    Research Blurb: My research focuses on machine learning and control algorithms for the challenging task of autonomous racing in Gran Turismo Sport. I leverage my background in Mechanical Engineering to discover how machine learning and model-based optimal control can create safe, high-performance control systems for robotics and autonomous systems. A particular emphasis of mine has been how to leverage offline datasets (e.g., human players’ racing trajectories) to inform better, more sample-efficient control algorithms.
    Jobs Interested In: Research Scientist and Robotics/Controls Engineer

    Chawin Sitawarin
    Email: chawin.sitawarin@gmail.com | Website: https://chawins.github.io/ | Advisor(s): David Wagner
    Research Blurb: I am broadly interested in the security and safety aspects of machine learning systems. Most of my previous works are in the domain of adversarial machine learning, particularly adversarial examples and robustness of machine learning algorithms. More recently, I am excited about emerging security and privacy risks on large language models.
    Jobs Interested In: Research Scientist

    Dhruv Shah
    Email: shah@cs.berkeley.edu | Website: http://cs.berkeley.edu/~shah/ | Advisor(s): Sergey Levine
    Research Blurb: I train big(-ish) models and make robots smarter.
    Jobs Interested In: Research Scientist, Roboticist

    Eliza Kosoy
    Email: eko@berkeley.edu | Website: https://www.elizakosoy.com/ | Advisor(s): Alison Gopnik
    Research Blurb: Eliza Kosoy works at the intersection of child development and AI with Prof. Alison Gopnik. Her work includes creating evaluative benchmarks for LLMs rooted in child development and studying how children and adults use GenAI models such as ChatGPT/DALL-E and form mental models about them. She’s an intern at Google working on the AI/UX team and previously worked with the Empathy Lab. She has published at NeurIPS, ICML, ICLR, CogSci, and in Cognition. Her thesis work created a unified virtual environment for testing children and AI models in one place, for the purposes of training RL models. She also has experience building startups and STEM hardware coding toys.
    Jobs Interested In: Research Scientist (child development and AI), AI Safety (specializing in children), User Experience (UX) Researcher (specializing in mixed methods, youth, AI, LLMs), Education and AI (STEM toys)

    Fangyu Wu
    Email: fangyuwu@berkeley.edu | Website: https://fangyuwu.com/ | Advisor(s): Alexandre Bayen
    Research Blurb: Under the mentorship of Prof. Alexandre Bayen, Fangyu focuses on the application of optimization methods to multi-agent robotic systems, particularly in the planning and control of automated vehicles.
    Jobs Interested In: Faculty, or Research Scientist in control, optimization, and robotics

    Frances Ding
    Email: frances@berkeley.edu | Website: https://www.francesding.com/ | Advisor(s): Jacob Steinhardt, Moritz Hardt
    Research Blurb: My research focus is in machine learning for protein modeling. I work on improving protein property classification and protein design, as well as understanding what different protein models learn. I have previously worked on sequence models for DNA and RNA, and benchmarks for evaluating the interpretability and fairness of ML models across domains.
    Jobs Interested In: Research Scientist

    Jianlan Luo
    Email: jianlanluo@eecs.berkeley.edu | Website: https://people.eecs.berkeley.edu/~jianlanluo/ | Advisor(s): Sergey Levine
    Research Blurb: My research interests are broadly in scalable algorithms and practice of machine learning, robotics, and controls; particularly their intersections.
    Jobs Interested In: Faculty, Research Scientist

    Kathy Jang
    Email: kathyjang@gmail.com | Website: https://kathyjang.com | Advisor(s): Alexandre Bayen
    Research Blurb: My thesis work has specialized in reinforcement learning for autonomous vehicles, focusing on enhancing decision-making and efficiency in applied settings. In future work, I’m eager to apply these principles to broader challenges across domains like natural language processing. With my background, my aim is to see the direct impact of my efforts by contributing to innovative AI research and solutions.
    Jobs Interested In: ML Research Scientist/Engineer

    Kevin Lin
    Email: k-lin@berkeley.edu | Website: https://people.eecs.berkeley.edu/~kevinlin/ | Advisor(s): Dan Klein, Joseph E. Gonzalez
    Research Blurb: My research focuses on understanding and improving how language models use and provide information.
    Jobs Interested In: Research Scientist

    Nikhil Ghosh
    Email: nikhil_ghosh@berkeley.edu | Website: https://nikhil-ghosh-berkeley.github.io/ | Advisor(s): Bin Yu, Song Mei
    Research Blurb: I am interested in developing a better foundational understanding of deep learning and improving practical systems, using both theoretical and empirical methodology. Currently, I am especially interested in improving the efficiency of large models by studying how to properly scale hyperparameters with model size.
    Jobs Interested In: Research Scientist

    Olivia Watkins
    Email: oliviawatkins@berkeley.edu | Website: https://aliengirlliv.github.io/oliviawatkins | Advisor(s): Pieter Abbeel and Trevor Darrell
    Research Blurb: My work involves RL, BC, learning from humans, and using common-sense foundation model reasoning for agent learning. I’m excited about language agent learning, supervision, alignment, and robustness.
    Jobs Interested In: Research Scientist

    Ruiming Cao
    Email: rcao@berkeley.edu | Website: https://rmcao.net | Advisor(s): Laura Waller
    Research Blurb: My research is on computational imaging, particularly space-time modeling for dynamic scene recovery and motion estimation. I also work on optical microscopy techniques, optimization-based optical design, event camera processing, and novel view rendering.
    Jobs Interested In: Research Scientist, Postdoc, Faculty

    Ryan Hoque
    Email: ryanhoque@berkeley.edu | Website: https://ryanhoque.github.io | Advisor(s): Ken Goldberg
    Research Blurb: Imitation learning and reinforcement learning algorithms that scale to large robot fleets performing manipulation and other complex tasks.
    Jobs Interested In: Research Scientist

    Sam Toyer
    Email: sdt@berkeley.edu | Website: https://www.qxcv.net/ | Advisor(s): Stuart Russell
    Research Blurb: My research focuses on making language models secure, robust, and safe. I also have experience in vision, planning, imitation learning, reinforcement learning, and reward learning.
    Jobs Interested In: Research Scientist

    Shishir G. Patil
    Email: shishirpatil2007@gmail.com | Website: https://shishirpatil.github.io/ | Advisor(s): Joseph Gonzalez
    Research Blurb: Gorilla LLM: teaching LLMs to use tools (https://gorilla.cs.berkeley.edu/); LLM Execution Engine: guaranteeing reversibility, robustness, and minimal blast radius for LLM agents incorporated into user and enterprise workflows; POET: memory-bound and energy-efficient fine-tuning of LLMs on edge devices such as smartphones and laptops (https://poet.cs.berkeley.edu/).
    Jobs Interested In: Research Scientist

    Suzie Petryk
    Email: spetryk@berkeley.edu | Website: https://suziepetryk.com/ | Advisor(s): Trevor Darrell, Joseph Gonzalez
    Research Blurb: I work on improving the reliability and safety of multimodal models. My focus has been on localizing and reducing hallucinations for vision + language models, along with measuring and using uncertainty and mitigating bias. My interests lie in applying solutions to these challenges in actual production scenarios, rather than solely in academic environments.
    Jobs Interested In: Applied Research Scientist in generative AI, safety, and/or accessibility

    Xingyu Lin
    Email: xingyu@berkeley.edu | Website: https://xingyu-lin.github.io/ | Advisor(s): Pieter Abbeel
    Research Blurb: My research lies in robotics, machine learning, and computer vision, with the primary goal of learning generalizable robot skills from two angles: (1) learning structured world models with spatial and temporal abstractions; (2) pre-training visual representations and skills to enable knowledge transfer from Internet-scale vision datasets and simulators.
    Jobs Interested In: Faculty, or Research Scientist

    Yaodong Yu
    Email: yyu@eecs.berkeley.edu | Website: https://yaodongyu.github.io/ | Advisor(s): Michael I. Jordan, Yi Ma
    Research Blurb: My research interests are broadly in the theory and practice of trustworthy machine learning, including interpretability, privacy, and robustness.
    Jobs Interested In: Faculty

  • The Shift from Models to Compound AI Systems
    on February 18, 2024 at 9:00 am

    AI caught everyone’s attention in 2023 with Large Language Models (LLMs) that can be instructed to perform general tasks, such as translation or coding, just by prompting. This naturally led to an intense focus on models as the primary ingredient in AI application development, with everyone wondering what capabilities new LLMs will bring. As more developers begin to build using LLMs, however, we believe that this focus is rapidly changing: state-of-the-art AI results are increasingly obtained by compound systems with multiple components, not just monolithic models.

    For example, Google’s AlphaCode 2 set state-of-the-art results in programming through a carefully engineered system that uses LLMs to generate up to 1 million possible solutions for a task and then filter down the set. AlphaGeometry, likewise, combines an LLM with a traditional symbolic solver to tackle olympiad problems. In enterprises, our colleagues at Databricks found that 60% of LLM applications use some form of retrieval-augmented generation (RAG), and 30% use multi-step chains. Even researchers working on traditional language model tasks, who used to report results from a single LLM call, are now reporting results from increasingly complex inference strategies: Microsoft wrote about a chaining strategy that exceeded GPT-4’s accuracy on medical exams by 9%, and Google’s Gemini launch post measured its MMLU benchmark results using a new CoT@32 inference strategy that calls the model 32 times, which raised questions about its comparison to just a single call to GPT-4. This shift to compound systems opens many interesting design questions, but it is also exciting, because it means leading AI results can be achieved through clever engineering, not just scaling up training.

    In this post, we analyze the trend toward compound AI systems and what it means for AI developers. Why are developers building compound systems? Is this paradigm here to stay as models improve? And what are the emerging tools for developing and optimizing such systems—an area that has received far less research than model training? We argue that compound AI systems will likely be the best way to maximize AI results in the future, and might be one of the most impactful trends in AI in 2024.

    Figure: Increasingly many new AI results are from compound systems.

    Why Use Compound AI Systems?

    We define a Compound AI System as a system that tackles AI tasks using multiple interacting components, including multiple calls to models, retrievers, or external tools. In contrast, an AI Model is simply a statistical model, e.g., a Transformer that predicts the next token in text. Even though AI models are continually getting better, and there is no clear end in sight to their scaling, more and more state-of-the-art results are obtained using compound systems. Why is that? We have seen several distinct reasons:

    Some tasks are easier to improve via system design. While LLMs appear to follow remarkable scaling laws that predictably yield better results with more compute, in many applications, scaling offers lower returns-vs-cost than building a compound system. For example, suppose that the current best LLM can solve coding contest problems 30% of the time, and tripling its training budget would increase this to 35%; this is still not reliable enough to win a coding contest! In contrast, engineering a system that samples from the model multiple times, tests each sample, etc. might increase performance to 80% with today’s models, as shown in work like AlphaCode. The sketch below illustrates this sample-and-test idea.
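
    As a minimal sketch (not AlphaCode's actual pipeline), best-of-n sampling with a filter is only a handful of lines; generate_candidate and passes_tests are placeholders for a model call and a test harness.

      # Sample many candidate solutions, keep only those that pass tests.
      def solve(problem, generate_candidate, passes_tests, n_samples=100):
          candidates = (generate_candidate(problem) for _ in range(n_samples))
          survivors = [c for c in candidates if passes_tests(problem, c)]
          # Real systems like AlphaCode also cluster and rank the survivors.
          return survivors[0] if survivors else None
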
    Even more importantly, iterating on a system design is often much faster than waiting for training runs. We believe that in any high-value application, developers will want to use every tool available to maximize AI quality, so they will use system ideas in addition to scaling. We frequently see this with LLM users, where a good LLM creates a compelling but frustratingly unreliable first demo, and engineering teams then go on to systematically raise quality.

    Systems can be dynamic. Machine learning models are inherently limited because they are trained on static datasets, so their “knowledge” is fixed. Therefore, developers need to combine models with other components, such as search and retrieval, to incorporate timely data. In addition, training lets a model “see” the whole training set, so more complex systems are needed to build AI applications with access controls (e.g., answering a user’s questions based only on files the user has access to).

    Improving control and trust is easier with systems. Neural network models alone are hard to control: while training will influence them, it is nearly impossible to guarantee that a model will avoid certain behaviors. Using an AI system instead of a model can help developers control behavior more tightly, e.g., by filtering model outputs. Likewise, even the best LLMs still hallucinate, but a system combining, say, LLMs with retrieval can increase user trust by providing citations or automatically verifying facts.

    Performance goals vary widely. Each AI model has a fixed quality level and cost, but applications often need to vary these parameters. In some applications, such as inline code suggestions, the best AI models are too expensive, so tools like GitHub Copilot use carefully tuned smaller models and various search heuristics to provide results. In other applications, even the largest models, like GPT-4, are too cheap! Many users would be willing to pay a few dollars for a correct legal opinion, instead of the few cents it takes to ask GPT-4, but a developer would need to design an AI system to utilize this larger budget.

    The shift to compound systems in Generative AI also matches the industry trends in other AI fields, such as self-driving cars: most of the state-of-the-art implementations are systems with multiple specialized components (more discussion here). For these reasons, we believe compound AI systems will remain a leading paradigm even as models improve.

    Developing Compound AI Systems

    While compound AI systems can offer clear benefits, the art of designing, optimizing, and operating them is still emerging. On the surface, an AI system is a combination of traditional software and AI models, but there are many interesting design questions. For example, should the overall “control logic” be written in traditional code (e.g., Python code that calls an LLM), or should it be driven by an AI model (e.g., LLM agents that call external tools)? Likewise, in a compound system, where should a developer invest resources—for example, in a RAG pipeline, is it better to spend more FLOPS on the retriever or the LLM, or even to call an LLM multiple times? Finally, how can we optimize an AI system with discrete components end-to-end to maximize a metric, the same way we can train a neural network? In this section, we detail a few example AI systems, then discuss these challenges and recent research on them.

    The AI System Design Space

    Below are a few recent compound AI systems that show the breadth of design choices:

    AlphaCode 2
    Components: fine-tuned LLMs for sampling and scoring programs; a code execution module; a clustering model.
    Design: generates up to 1 million solutions for a coding problem, then filters and scores them.
    Results: matches the 85th percentile of humans on coding contests.

    AlphaGeometry
    Components: a fine-tuned LLM; a symbolic math engine.
    Design: iteratively suggests constructions in a geometry problem via the LLM and checks deduced facts produced by the symbolic engine.
    Results: between silver and gold International Math Olympiad medalists on a timed test.

    Medprompt
    Components: GPT-4; nearest-neighbor search in a database of correct examples; LLM-generated chain-of-thought examples; multiple samples and ensembling.
    Design: answers medical questions by searching for similar examples to construct a few-shot prompt, adding model-generated chain-of-thought for each example, and generating and judging up to 11 solutions.
    Results: outperforms specialized medical models like Med-PaLM used with simpler prompting strategies.

    Gemini on MMLU
    Components: the Gemini LLM; custom inference logic.
    Design: Gemini’s CoT@32 inference strategy for the MMLU benchmark samples 32 chain-of-thought answers from the model, and returns the top choice if enough of them agree, or uses generation without chain-of-thought if not.
    Results: 90.04% on MMLU, compared to 86.4% for GPT-4 with 5-shot prompting or 83.7% for Gemini with 5-shot prompting.

    ChatGPT Plus
    Components: an LLM; a Web Browser plugin for retrieving timely content; a Code Interpreter plugin for executing Python; the DALL-E image generator.
    Design: the ChatGPT Plus offering can call tools such as web browsing to answer questions; the LLM determines when and how to call each tool as it responds.
    Results: popular consumer AI product with millions of paid subscribers.

    RAG, ORQA, Bing, Baleen, etc.
    Components: an LLM (sometimes called multiple times); a retrieval system.
    Design: combine LLMs with retrieval systems in various ways, e.g., asking an LLM to generate a search query, or directly searching for the current context.
    Results: widely used technique in search engines and enterprise apps.

    Key Challenges in Compound AI Systems

    Compound AI systems pose new challenges in design, optimization, and operation compared to AI models.

    Design Space. The range of possible system designs for a given task is vast. For example, even in the simple case of retrieval-augmented generation (RAG) with a retriever and language model, there are: (i) many retrieval and language models to choose from, (ii) other techniques to improve retrieval quality, such as query expansion or reranking models, and (iii) techniques to improve the LLM’s generated output (e.g., running another LLM to check that the output relates to the retrieved passages). Developers have to explore this vast space to find a good design. In addition, developers need to allocate limited resources, like latency and cost budgets, among the system components. For example, if you want to answer RAG questions in 100 milliseconds, should you budget to spend 20 ms on the retriever and 80 ms on the LLM, or the other way around?

    Optimization. Often in ML, maximizing the quality of a compound system requires co-optimizing the components to work well together. For example, consider a simple RAG application where an LLM sees a user question, generates a search query to send to a retriever, and then generates an answer. Ideally, the LLM would be tuned to generate queries that work well for that particular retriever, and the retriever would be tuned to prefer answers that work well for that LLM. The sketch below shows where these two tuning knobs sit in even the simplest RAG pipeline.
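
    A minimal RAG pipeline sketch, with placeholders for the retriever and the LLM; the prompts are illustrative.

      # The query-writing LLM and the passage-ranking retriever are exactly
      # the two components one would like to tune jointly.
      def rag_answer(question, retriever, llm, k=5):
          query = llm(f"Write a search query for: {question}")  # knob 1: query style
          passages = retriever(query, k=k)                      # knob 2: ranking and k
          context = "\n".join(passages)
          return llm(f"Answer using only this context:\n{context}\n\nQ: {question}")
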
    In single-model development a la PyTorch, users can easily optimize a model end-to-end because the whole model is differentiable. However, compound AI systems contain non-differentiable components like search engines or code interpreters, and thus require new methods of optimization. Optimizing these compound AI systems is still a new research area; for example, DSPy offers a general optimizer for pipelines of pretrained LLMs and other components, while other systems, like LaMDA, Toolformer, and AlphaGeometry, use tool calls during model training to optimize models for those tools.

    Operation. Machine learning operations (MLOps) become more challenging for compound AI systems. For example, while it is easy to track success rates for a traditional ML model like a spam classifier, how should developers track and debug the performance of an LLM agent for the same task, which might use a variable number of “reflection” steps or external API calls to classify a message? We believe that a new generation of MLOps tools will be developed to tackle these problems. Interesting problems include:

    Monitoring: How can developers most efficiently log, analyze, and debug traces from complex AI systems?
    DataOps: Because many AI systems involve data serving components like vector DBs, and their behavior depends on the quality of data served, any focus on operations for these systems should additionally span data pipelines.
    Security: Research has shown that compound AI systems, such as an LLM chatbot with a content filter, can create unforeseen security risks compared to individual models. New tools will be required to secure these systems.

    Emerging Paradigms

    To tackle the challenges of building compound AI systems, multiple new approaches are arising in industry and in research. We highlight a few of the most widely used ones and examples from our research on tackling these challenges.

    Designing AI Systems: Composition Frameworks and Strategies. Many developers are now using “language model programming” frameworks that let them build applications out of multiple calls to AI models and other components. These include component libraries like LangChain and LlamaIndex that developers call from traditional programs, agent frameworks like AutoGPT and BabyAGI that let an LLM drive the application, and tools for controlling LM outputs, like Guardrails, Outlines, LMQL, and SGLang. In parallel, researchers are developing numerous new inference strategies to generate better outputs using calls to models and tools, such as chain-of-thought, self-consistency, WikiChat, RAG, and others.

    Automatically Optimizing Quality: DSPy. Coming from academia, DSPy is the first framework that aims to optimize a system composed of LLM calls and other tools to maximize a target metric. Users write an application out of calls to LLMs and other tools, and provide a target metric such as accuracy on a validation set, and then DSPy automatically tunes the pipeline by creating prompt instructions, few-shot examples, and other parameter choices for each module to maximize end-to-end performance. The effect is similar to end-to-end optimization of a multi-layer neural network in PyTorch, except that the modules in DSPy are not always differentiable layers. To do that, DSPy leverages the linguistic abilities of LLMs in a clean way: to specify each module, users write a natural language signature, such as user_question -> search_query, where the names of the input and output fields are meaningful, and DSPy automatically turns this into suitable prompts with instructions, few-shot examples, or even weight updates to the underlying language models.
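
    As a toy illustration of why such signatures are enough to steer a model (this is a sketch of the idea, not DSPy's actual implementation), a signature string can be expanded mechanically into a prompt whose field names do the steering:

      # Expand an "inputs -> output" signature into a fill-in-the-field prompt.
      def make_prompt(signature: str, **inputs) -> str:
          in_part, out_field = [s.strip() for s in signature.split("->")]
          lines = [f"{f.strip()}: {inputs[f.strip()]}" for f in in_part.split(",")]
          lines.append(f"{out_field}:")  # the model completes this field
          return "\n".join(lines)

      prompt = make_prompt("user_question -> search_query",
                           user_question="Who won the 2022 World Cup final?")
      # The meaningful field name "search_query" nudges the LLM to emit a
      # query rather than a direct answer; DSPy then tunes instructions and
      # few-shot examples around this skeleton automatically.
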
    Optimizing Cost: FrugalGPT and AI Gateways. The wide range of AI models and services available makes it challenging to pick the right one for an application. Moreover, different models may perform better on different inputs. FrugalGPT is a framework to automatically route inputs to different AI model cascades to maximize quality subject to a target budget. Based on a small set of examples, it learns a routing strategy that can outperform the best LLM services by up to 4% at the same cost, or reduce cost by up to 90% while matching their quality. FrugalGPT is an example of a broader emerging concept of AI gateways or routers, implemented in software like Databricks AI Gateway, OpenRouter, and Martian, to optimize the performance of each component of an AI application. These systems work even better when an AI task is broken into smaller modular steps in a compound system, and the gateway can optimize routing separately for each step.

    Operation: LLMOps and DataOps. AI applications have always required careful monitoring of both model outputs and data pipelines to run reliably. With compound AI systems, however, the behavior of the system on each input can be considerably more complex, so it is important to track all the steps taken by the application and intermediate outputs. Software like LangSmith, Phoenix Traces, and Databricks Inference Tables can track, visualize, and evaluate these outputs at a fine granularity, in some cases also correlating them with data pipeline quality and downstream metrics. In the research world, DSPy Assertions seeks to leverage feedback from monitoring checks directly in AI systems to improve outputs, and AI-based quality evaluation methods like MT-Bench, FAVA, and ARES aim to automate quality monitoring.

    Conclusion

    Generative AI has excited every developer by unlocking a wide range of capabilities through natural language prompting. As developers aim to move beyond demos and maximize the quality of their AI applications, however, they are increasingly turning to compound AI systems as a natural way to control and enhance the capabilities of LLMs. Figuring out the best practices for developing compound AI systems is still an open question, but there are already exciting approaches to aid with design, end-to-end optimization, and operation. We believe that compound AI systems will remain the best way to maximize the quality and reliability of AI applications going forward, and may be one of the most important trends in AI in 2024.

    BibTex for this post:

      @misc{compound-ai-blog,
        title={The Shift from Models to Compound AI Systems},
        author={Matei Zaharia and Omar Khattab and Lingjiao Chen and Jared Quincy Davis and Heather Miller and Chris Potts and James Zou and Michael Carbin and Jonathan Frankle and Naveen Rao and Ali Ghodsi},
        howpublished={\url{https://bair.berkeley.edu/blog/2024/02/18/compound-ai-systems/}},
        year={2024}
      }

  • Ghostbuster: Detecting Text Ghostwritten by Large Language Models
    on November 14, 2023 at 12:30 pm

    Figure: The structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text.

    Large language models like ChatGPT write impressively well—so well, in fact, that they’ve become a problem. Students have begun using these models to ghostwrite assignments, leading some schools to ban ChatGPT. In addition, these models are also prone to producing text with factual errors, so wary readers may want to know if generative AI tools have been used to ghostwrite news articles or other sources before trusting them. What can teachers and consumers do? Existing tools to detect AI-generated text sometimes do poorly on data that differs from what they were trained on. In addition, if these models falsely classify real human writing as AI-generated, they can jeopardize students whose genuine work is called into question.

    Our recent paper introduces Ghostbuster, a state-of-the-art method for detecting AI-generated text. Ghostbuster works by finding the probability of generating each token in a document under several weaker language models, then combining functions based on these probabilities as input to a final classifier. Ghostbuster doesn’t need to know what model was used to generate a document, nor the probability of generating the document under that specific model. This property makes Ghostbuster particularly useful for detecting text potentially generated by an unknown model or a black-box model, such as the popular commercial models ChatGPT and Claude, for which probabilities aren’t available. We’re particularly interested in ensuring that Ghostbuster generalizes well, so we evaluated it across a range of ways that text could be generated, including different domains (using newly collected datasets of essays, news, and stories), language models, and prompts.

    Figure: Examples of human-authored and AI-generated text from our datasets.

    Why this Approach?

    Many current AI-generated text detection systems are brittle when classifying different types of text (e.g., different writing styles, or different text generation models or prompts). Simpler models that use perplexity alone typically can’t capture more complex features and do especially poorly on new writing domains. In fact, we found that a perplexity-only baseline was worse than random on some domains, including non-native English speaker data. Meanwhile, classifiers based on large language models like RoBERTa easily capture complex features, but overfit to the training data and generalize poorly: we found that a RoBERTa baseline had catastrophic worst-case generalization performance, sometimes even worse than a perplexity-only baseline. Zero-shot methods that classify text without training on labeled data, by calculating the probability that the text was generated by a specific model, also tend to do poorly when a different model was actually used to generate the text.

    How Ghostbuster Works

    Ghostbuster uses a three-stage training process: computing probabilities, selecting features, and classifier training.

    Computing probabilities: We converted each document into a series of vectors by computing the probability of generating each word in the document under a series of weaker language models (a unigram model, a trigram model, and two non-instruction-tuned GPT-3 models, ada and davinci).

    Selecting features: We used a structured search procedure to select features, which works by (1) defining a set of vector and scalar operations that combine the probabilities, and (2) searching for useful combinations of these operations using forward feature selection, repeatedly adding the best remaining feature.

    Classifier training: We trained a linear classifier on the best probability-based features and some additional manually selected features.
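
    For readers unfamiliar with forward feature selection, here is a minimal sketch of the greedy loop; the candidate feature matrix and the cross-validated scoring are stand-ins, not Ghostbuster's actual search space.

      # Greedy forward feature selection over candidate feature columns.
      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import cross_val_score

      def forward_select(X: np.ndarray, y: np.ndarray, max_features: int) -> list:
          chosen, remaining, best_score = [], list(range(X.shape[1])), 0.0
          for _ in range(max_features):
              scores = {j: cross_val_score(LogisticRegression(max_iter=1000),
                                           X[:, chosen + [j]], y, cv=3).mean()
                        for j in remaining}
              j_best = max(scores, key=scores.get)
              if scores[j_best] <= best_score:
                  break  # no remaining feature improves validation accuracy
              chosen.append(j_best)
              remaining.remove(j_best)
              best_score = scores[j_best]
          return chosen
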
    Results

    When trained and tested on the same domain, Ghostbuster achieved 99.0 F1 across all three datasets, outperforming GPTZero by a margin of 5.9 F1 and DetectGPT by 41.6 F1. Out of domain, Ghostbuster achieved 97.0 F1 averaged across all conditions, outperforming DetectGPT by 39.6 F1 and GPTZero by 7.5 F1. Our RoBERTa baseline achieved 98.1 F1 when evaluated in-domain on all datasets, but its generalization performance was inconsistent. Ghostbuster outperformed the RoBERTa baseline on all domains except creative writing out-of-domain, and had much better out-of-domain performance than RoBERTa on average (a 13.8 F1 margin).

    Figure: Results on Ghostbuster’s in-domain and out-of-domain performance.

    To ensure that Ghostbuster is robust to the range of ways that a user might prompt a model, such as requesting different writing styles or reading levels, we evaluated Ghostbuster’s robustness to several prompt variants. Ghostbuster outperformed all other tested approaches on these prompt variants with 99.5 F1. To test generalization across models, we evaluated performance on text generated by Claude, where Ghostbuster also outperformed all other tested approaches with 92.2 F1.

    AI-generated text detectors have been fooled by lightly editing the generated text. We examined Ghostbuster’s robustness to edits, such as swapping sentences or paragraphs, reordering characters, or replacing words with synonyms. Most changes at the sentence or paragraph level didn’t significantly affect performance, though performance decreased smoothly if the text was edited through repeated paraphrasing, using commercial detection evaders such as Undetectable AI, or making numerous word- or character-level changes. Performance was also best on longer documents.

    Since AI-generated text detectors may misclassify non-native English speakers’ text as AI-generated, we evaluated Ghostbuster’s performance on non-native English speakers’ writing. All tested models had over 95% accuracy on two of three tested datasets, but did worse on the third set of shorter essays. However, document length may be the main factor here, since Ghostbuster does nearly as well on these documents (74.7 F1) as it does on other out-of-domain documents of similar length (75.6 to 93.1 F1).

    Users who wish to apply Ghostbuster to real-world cases of potential off-limits usage of text generation (e.g., ChatGPT-written student essays) should note that errors are more likely for shorter text, domains far from those Ghostbuster trained on (e.g., different varieties of English), text by non-native speakers of English, human-edited model generations, or text generated by prompting an AI model to modify a human-authored input. To avoid perpetuating algorithmic harms, we strongly discourage automatically penalizing alleged usage of text generation without human supervision. Instead, we recommend cautious, human-in-the-loop use of Ghostbuster if classifying someone’s writing as AI-generated could harm them.

    Ghostbuster can also help with a variety of lower-risk applications, including filtering AI-generated text out of language model training data and checking if online sources of information are AI-generated.

    Conclusion

    Ghostbuster is a state-of-the-art AI-generated text detection model, with 99.0 F1 performance across tested domains, representing substantial progress over existing models. It generalizes well to different domains, prompts, and models, and it’s well-suited to identifying text from black-box or unknown models because it doesn’t require access to probabilities from the specific model used to generate the document. Future directions for Ghostbuster include providing explanations for model decisions and improving robustness to attacks that specifically try to fool detectors. AI-generated text detection approaches can also be used alongside alternatives, such as watermarking. We also hope that Ghostbuster can help across a variety of applications, such as filtering language model training data or flagging AI-generated content on the web.

    Try Ghostbuster here: ghostbuster.app
    Learn more about Ghostbuster here: [ paper ] [ code ]
    Try guessing if text is AI-generated yourself here: ghostbuster.app/experiment

  • Asymmetric Certified Robustness via Feature-Convex Neural Networks
    on November 14, 2023 at 9:00 am

    TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios. This focused setting allows us to introduce feature-convex classifiers, which produce closed-form and deterministic certified radii on the order of milliseconds.

    Figure 1. Illustration of feature-convex classifiers and their certification for sensitive-class inputs. This architecture composes a Lipschitz-continuous feature map $\varphi$ with a learned convex function $g$. Since $g$ is convex, it is globally underapproximated by its tangent plane at $\varphi(x)$, yielding certified norm balls in the feature space. Lipschitzness of $\varphi$ then yields appropriately scaled certificates in the original input space.

    Despite their widespread usage, deep learning classifiers are acutely vulnerable to adversarial examples: small, human-imperceptible image perturbations that fool machine learning models into misclassifying the modified input. This weakness severely undermines the reliability of safety-critical processes that incorporate machine learning. Many empirical defenses against adversarial perturbations have been proposed—often only to be later defeated by stronger attack strategies. We therefore focus on certifiably robust classifiers, which provide a mathematical guarantee that their prediction will remain constant for an $\ell_p$-norm ball around an input. Conventional certified robustness methods incur a range of drawbacks, including nondeterminism, slow execution, poor scaling, and certification against only one attack norm. We argue that these issues can be addressed by refining the certified robustness problem to be more aligned with practical adversarial settings.

    The Asymmetric Certified Robustness Problem

    Current certifiably robust classifiers produce certificates for inputs belonging to any class. For many real-world adversarial applications, this is unnecessarily broad. Consider the illustrative case of someone composing a phishing scam email while trying to avoid spam filters. This adversary will always attempt to fool the spam filter into thinking that their spam email is benign—never conversely. In other words, the attacker is solely attempting to induce false negatives from the classifier. Similar settings include malware detection, fake news flagging, social media bot detection, medical insurance claims filtering, financial fraud detection, phishing website detection, and many more.

    Figure 2. Asymmetric robustness in email filtering. Practical adversarial settings often require certified robustness for only one class.

    These applications all involve a binary classification setting with one sensitive class that an adversary is attempting to avoid (e.g., the “spam email” class). This motivates the problem of asymmetric certified robustness, which aims to provide certifiably robust predictions for inputs in the sensitive class while maintaining a high clean accuracy for all other inputs. We provide a more formal problem statement in the main text.

    Feature-convex classifiers

    We propose feature-convex neural networks to address the asymmetric robustness problem. This architecture composes a simple Lipschitz-continuous feature map ${\varphi: \mathbb{R}^d \to \mathbb{R}^q}$ with a learned Input-Convex Neural Network (ICNN) ${g: \mathbb{R}^q \to \mathbb{R}}$ (Figure 1). ICNNs enforce convexity from the input to the output logit by composing ReLU nonlinearities with nonnegative weight matrices. Since a binary ICNN decision region consists of a convex set and its complement, we add the precomposed feature map $\varphi$ to permit nonconvex decision regions.

    Feature-convex classifiers enable the fast computation of sensitive-class certified radii for all $\ell_p$-norms. Using the fact that convex functions are globally underapproximated by any tangent plane, we can obtain a certified radius in the intermediate feature space. This radius is then propagated to the input space by Lipschitzness. The asymmetric setting here is critical, as this architecture only produces certificates for the positive-logit class $g(\varphi(x)) > 0$. The resulting $\ell_p$-norm certified radius formula is particularly elegant:

    \[ r_p(x) = \frac{ \color{blue}{g(\varphi(x))} }{ \mathrm{Lip}_p(\varphi) \, \color{red}{\| \nabla g(\varphi(x)) \|_{p,*}} }. \]

    The non-constant terms are easily interpretable: the radius scales proportionally to the classifier confidence and inversely to the classifier sensitivity. We evaluate these certificates across a range of datasets, achieving competitive $\ell_1$ certificates and comparable $\ell_2$ and $\ell_{\infty}$ certificates—despite other methods generally tailoring for a specific norm and requiring orders of magnitude more runtime.
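
    The formula translates almost directly into code. Below is a hedged PyTorch sketch of the certificate computation, where the feature map, the ICNN, and the Lipschitz constant are placeholders supplied by the caller; this illustrates the formula above and is not the paper's released implementation.

      # Closed-form certified radius for a feature-convex classifier.
      import torch

      def certified_radius(g, phi, x, lip_phi: float, p: float = 2.0) -> float:
          """r_p(x) = g(phi(x)) / (Lip_p(phi) * ||grad g(phi(x))||_{p,*})."""
          z = phi(x).detach().requires_grad_(True)
          logit = g(z)  # assumed to return a scalar tensor
          if logit.item() <= 0:
              return 0.0  # certificates exist only for the positive-logit (sensitive) class
          (grad,) = torch.autograd.grad(logit, z)
          q = p / (p - 1.0) if p > 1.0 else float("inf")  # dual-norm exponent p*
          dual_norm = grad.flatten().norm(p=q)
          return (logit / (lip_phi * dual_norm)).item()
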
    Figure 3. Sensitive-class certified radii on the CIFAR-10 cats-vs-dogs dataset for the $\ell_1$-norm. Runtimes on the right are averaged over $\ell_1$-, $\ell_2$-, and $\ell_{\infty}$-radii (note the log scaling).

    Our certificates hold for any $\ell_p$-norm and are closed form and deterministic, requiring just one forward and one backward pass per input. These are computable on the order of milliseconds and scale well with network size. For comparison, current state-of-the-art methods such as randomized smoothing and interval bound propagation typically take several seconds to certify even small networks. Randomized smoothing methods are also inherently nondeterministic, with certificates that hold only with high probability.

    Theoretical promise

    While initial results are promising, our theoretical work suggests that there is significant untapped potential in ICNNs, even without a feature map. Despite binary ICNNs being restricted to learning convex decision regions, we prove that there exists an ICNN that achieves perfect training accuracy on the CIFAR-10 cats-vs-dogs dataset.

    Fact. There exists an input-convex classifier which achieves perfect training accuracy for the CIFAR-10 cats-versus-dogs dataset.

    However, our architecture achieves just $73.4\%$ training accuracy without a feature map. While training performance does not imply test set generalization, this result suggests that ICNNs are at least theoretically capable of attaining the modern machine learning paradigm of overfitting to the training dataset. We thus pose the following open problem for the field.

    Open problem. Learn an input-convex classifier which achieves perfect training accuracy for the CIFAR-10 cats-versus-dogs dataset.

    Conclusion

    We hope that the asymmetric robustness framework will inspire novel architectures which are certifiable in this more focused setting. Our feature-convex classifier is one such architecture and provides fast, deterministic certified radii for any $\ell_p$-norm. We also pose the open problem of overfitting the CIFAR-10 cats-vs-dogs training dataset with an ICNN, which we show is theoretically possible.

    This post is based on the following paper: Asymmetric Certified Robustness via Feature-Convex Neural Networks, by Samuel Pfrommer, Brendon G. Anderson, Julien Piet, and Somayeh Sojoudi, 37th Conference on Neural Information Processing Systems (NeurIPS 2023). Further details are available on arXiv and GitHub. If our paper inspires your work, please consider citing it with:

      @inproceedings{pfrommer2023asymmetric,
        title={Asymmetric Certified Robustness via Feature-Convex Neural Networks},
        author={Samuel Pfrommer and Brendon G. Anderson and Julien Piet and Somayeh Sojoudi},
        booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
        year={2023}
      }


Econbrowser
Analysis of current economic conditions and policy

  • Fed Board: “Why is the U.S. GDP recovering faster than other advanced economies?”
    by Menzie Chinn on May 19, 2024 at 7:23 pm

    An extensive Board article released on Friday: Figure 1 shows that the US has reverted to its pre-Covid trend, while other economies have not (pity the poor UK, buffeted by Covid and Brexit). From the Conclusion: Our analysis points to the growth divergence between the U.S. and AFEs (advanced foreign economies) being the result of a variety of factors, …

  • Eric Hovde Predicts
    by Menzie Chinn on May 19, 2024 at 5:55 am

    recession, a stock market decline, and a housing market decline. From December 19th (Newport Beach Independent):
    Economic Slowdown: The U.S. is likely to enter a recession, with consumers expected to deplete their savings, leading to only one potentially positive GDP quarter in 2024.
    Corporate Downsizing and Unemployment: Anticipated downsizing in corporations may push unemployment rates up, though …

  • One Year Ahead Inflation Expectations
    by Menzie Chinn on May 19, 2024 at 3:31 am

    April and May inflation was overpredicted by year-ahead consumer-based surveys. Figure 1: CPI inflation year-on-year (black), median expected from Survey of Professional Forecasters (blue +), median expected from Michigan Survey of Consumers (red), median from NY Fed Survey of Consumer Expectations (light green), unit costs (chartreuse), all in %. May 2024 Michigan observation is preliminary.

  • CNY Overtakes CAD in FX Trading, CB Reserve Holdings in 2022
    by Menzie Chinn on May 18, 2024 at 10:30 pm

    Talking about the dollar as a reserve currency next week [2], and noticed these interesting trends. Figure 1: Share of FX turnover in CNY (red square), in CAD (chartreuse triangle), in April. Shares normalized to 1.00. Source: BIS Triennial Surveys. A similar pattern holds for central bank reserve holdings as reported in the IMF’s COFER.

  • Private Nonfinancial Corporate Debt-Service-Ratio
    by Menzie Chinn on May 18, 2024 at 6:06 pm

    In yesterday’s post, I noted that a recession forecast based on a probit specification incorporating a debt-service ratio yielded a substantially lower probability for 2024M05 than a plain-vanilla specification. Part of why this is true is that the debt-service ratio is fairly low, despite high Treasury yields. Figure 1: Debt-service ratio for private nonfinancial corporations, %


Conversable Economist In Hume’s spirit, I will attempt to serve as an ambassador from my world of economics, and help in “finding topics of conversation fit for the entertainment of rational creatures.”

  • Pushback on Pessimism About Randomized Controlled Trials
    by conversableeconomist on May 17, 2024 at 6:39 pm

    Back in January, I posted about an article that was getting some attention in my world. Megan T. Stevenson is an active researcher in the criminal-justice-and-economics literature. She argues that when you look at the published studies that use randomized controlled trial methods to evaluate ways of reducing crime, most of the studies don’t show a …

  • Uber/Lyft vs. the Minneapolis City Council
    by conversableeconomist on May 15, 2024 at 2:00 pm

    The Minneapolis City Council voted back on March 7 to require that ride-sharing firms like Uber and Lyft increase the pay received by their drivers. Uber and Lyft both responded by saying that they would stop serving trips to or from locations in the city of Minneapolis; Uber said that it would leave the …

  • What Economic Research do Policymakers Want?
    by conversableeconomist on May 9, 2024 at 8:30 pm

    The obvious answer is that policymakers want research that supports their personal and political preferences. Conversely, policymakers don’t want research that might pressure them to change their views. But with that central truth duly noted, situations often arise where policymakers have an overall goal, but the details of how to achieve that goal, or how …

  • A Primer on Federal Home Loan Bank System
    by conversableeconomist on May 8, 2024 at 6:54 pm

    There are three “government-sponsored enterprises,” commonly called GSEs, that play a big role in US housing finance: the Federal Home Loan Banks, Fannie Mae, and Freddie Mac. Perhaps the key similarity across all three is that when they borrow money, the financial markets perceive that the federal government is standing behind the loan–and so they …

  • A Downside of the 15-Minute City
    by conversableeconomist on May 7, 2024 at 3:00 pm

    The “15-minute city” is getting some attention from urban planners. The idea is that everyone should be able to access the key destinations in their day-to-day life–work, food, schools, recreation–within a 15-minute walk, bike ride, or mass transit ride of their residence. Cars would then be unnecessary for many daily tasks. Most Americans do not …


UN News – Global perspective Human stories | Culture and Education


  • Syria: WHO Regional Director calls for greater investment in health sector
    by Global Issues on May 19, 2024 at 12:00 pm

    Failure to invest in the health of the Syrian people will only deepen instability in the war-ravaged country and pose threats to regional and global security, a senior official with the World Health Organization (WHO) has said. Read the full story, “Syria: WHO Regional Director calls for greater investment in health sector”, on globalissues.org →

  • Gaza: Nearly 800,000 now displaced from Rafah
    by Global Issues on May 18, 2024 at 12:00 pm

    Roughly 800,000 people have been forced to flee Rafah since Israel launched a military operation in the area on 6 May, the head of UN Palestine refugee agency UNRWA said on Saturday in a renewed appeal for greater protection of civilians in Gaza, safe humanitarian access and, ultimately, a ceasefire. Read the full story, “Gaza: Nearly 800,000 now displaced from Rafah”, on globalissues.org →

  • Rising Temperatures Drive Human-Wildlife Conflict in Zimbabwe
    by Global Issues on May 17, 2024 at 3:05 pm

    BULAWAYO, Zimbabwe, May 17 (IPS) – Rising temperatures are being blamed for an increase in human-wildlife conflicts in Zimbabwe as animals such as snakes leave their natural habitat earlier than usual. Read the full story, “Rising Temperatures Drive Human-Wildlife Conflict in Zimbabwe”, on globalissues.org →

  • Women Organize to Fight Coastal Erosion in Southeastern Brazil
    by Global Issues on May 17, 2024 at 2:13 pm

    ATAFONA, Brazil, May 17 (IPS) – Sonia Ferreira watched for years as the sea toppled buildings all around her. Finally, rising sea levels wrecked her home in 2019. Fishermen find their access to a fishing port limited, affecting their livelihoods. The residents of the coastal town of Atafona in southeastern Brazil count their losses to rising sea levels and climate change. Read the full story, “Women Organize to Fight Coastal Erosion in Southeastern Brazil”, on globalissues.org →

  • More Diversified Trade Can Make Middle East & Central Asia More Resilient
    by Global Issues on May 17, 2024 at 1:13 pm

    WASHINGTON DC, May 17 (IPS) – Dislocations from the pandemic, geoeconomic fragmentation, and Russia’s war in Ukraine have shifted world trade dynamics. While this has created challenges, the redirection of trade has also generated new opportunities, particularly for the Caucasus and Central Asia. Read the full story, “More Diversified Trade Can Make Middle East & Central Asia More Resilient”, on globalissues.org →

  • Crimes against nature: UN agency puts environmental legislation under scrutiny
    by Global Issues on May 17, 2024 at 12:00 pm

    Global efforts to prevent crimes against nature and bring offenders to justice are being hampered by glaring differences in environmental protection laws among countries and regions, UN crime prevention experts said on Friday. Read the full story, “Crimes against nature: UN agency puts environmental legislation under scrutiny”, on globalissues.org →

  • Gaza: Aid delivery via floating dock welcomed, but land routes ‘more important’
    by Global Issues on May 17, 2024 at 12:00 pm

    Trucks carrying desperately needed aid into Gaza have started moving ashore on the temporary floating dock built by the United States military, but this is not enough to meet the needs of civilians, UN humanitarian affairs office, OCHA, said on Friday. Read the full story, “Gaza: Aid delivery via floating dock welcomed, but land routes ‘more important’”, on globalissues.org →

  • Israel refutes South Africa’s accusations at UN world court
    by Global Issues on May 17, 2024 at 12:00 pm

    On Friday, the UN International Court of Justice (ICJ) heard Israel’s response in the case brought by South Africa requesting emergency provisional measures to immediately halt Israeli military operations under way in Rafah, in southern Gaza, where over one million Palestinians were sheltering after having been displaced from elsewhere in the enclave. Read the full story, “Israel refutes South Africa’s accusations at UN world court”, on globalissues.org →

  • UN rights office urges Sri Lanka to reveal fate of the disappeared
    by Global Issues on May 17, 2024 at 12:00 pm

    The UN human rights office, OHCHR, on Friday urged the Sri Lankan Government to take decisive action to uncover the fates and locations of tens of thousands of individuals subjected to enforced disappearances over the years and to hold those responsible accountable. Read the full story, “UN rights office urges Sri Lanka to reveal fate of the disappeared”, on globalissues.org →

  • India’s LGBTQIA+ community notches legal wins but still faces societal hurdles to acceptance, equal rights
    by Global Issues on May 17, 2024 at 12:00 pm

    While there has been some recent progress for India’s LGBTQIA+ community, there is still a long way to go to overcome social stigma and prejudice, and to ensure that all people in the country feel their rights are protected, regardless of gender identity or sexual orientation.Read the full story, “India’s LGBTQIA+ community notches legal wins but still faces societal hurdles to acceptance, equal rights”, on globalissues.org →



Defector The last good website.

  • For Lando Norris, One Win Is No Longer Enough
    by Kathryn Xu on May 19, 2024 at 5:19 pm

    When asked last week whether he would rather win by 20 seconds or in a close wheel-to-wheel race, Max Verstappen answered, with expected honesty, “At least 20 seconds!” For a team as dominant as Red Bull, that makes sense: If Verstappen isn’t winning by a large margin, then something must be going wrong. By that definition, the past two races have been a substantial failure: First, Lando Norris and McLaren’s shiny new upgrade package earned a win in Miami, and this very Sunday, Norris and McLaren finished less than a second short of a back-to-back win at the Emilia Romagna Grand Prix in Imola. You can often trace how hard a driver is pushing to win by how toxic they’re getting over the radio; by that indication, both Norris and Verstappen were in fine form this weekend. At the start of the race, when almost every driver was on medium tires and Verstappen had sailed off to a comfortable lead, it looked as though McLaren would be competing with Ferrari for the lower podium spots. Norris’s race engineer, Will Joseph, informed Norris that they were using the tires less than everyone else. In turn, Norris informed Joseph that, well, he didn’t have any pace. The next time Norris’s radio was played on the F1 TV broadcast, Norris first hit Joseph’s attempted update with a schoolteacher-esque response of, “Speak up.”

  • Now Let Us Praise Shota Imanaga
    by Kathryn Xu on May 19, 2024 at 3:52 pm

    Jeff Passan declared it, and so it must be so: It’s time to appreciate Shota Imanaga. As everyone anticipated before this season, the mid-May series between the Chicago Cubs and our Pittsburgh Pirates has gifted us some of the most exciting starting pitching so far this season. There’s Paul Skenes, the 21-year-old rookie for the Pirates whose stuff is so obviously nasty that he makes everyone fear for the state of his UCL whenever he pitches, and then there’s Imanaga, the 30-year-old rookie pitcher for the Cubs, who is in some ways the exact inverse of Skenes: a 5-foot-10 lefty throwing 92 mile-per-hour fastballs from a low arm slot. Imanaga knows how to play to a crowd. In his opening press conference, he prepared an English statement for the benefit of the fans. “Hey Chicago,” he said, and then waited out the laughter and applause. “What do you say? Cubs are gonna win today.” On Saturday, Imanaga did his part in blanking the Pirates over seven innings, lowering his ERA from a league-leading 0.96 down to a league-leading 0.84. For all that people see “Cubs pitcher with a low-velo fastball” and default, by heuristic, to “Now there’s a guy who simply gets muddy in all of his starts,” Imanaga is absolutely clean with it.

  • Thunder–Mavericks Deserved A Better Ending
    by Albert Burneko on May 19, 2024 at 2:37 pm

    The closing stretch of Game 6 between the Oklahoma City Thunder and the Dallas Mavericks featured some spectacular shooting and play-making. An off-balance 26-footer from Kyrie Irving that took a high bounce off the rim before dropping through to give Dallas a one-point lead with around three minutes to play; Shai Gilgeous-Alexander erasing that lead with what felt like his 50th smooth step-back jumper of the night; Derrick Jones Jr. recovering a tipped Irving shot and tossing in a parabolic fade-away jumper over OKC’s Chet Holmgren to give the Mavs, who’d been in deep shit at halftime and trailed pretty much all game, a five-point lead with just over a minute left to play. Gilgeous-Alexander, impossibly cool, immediately sauntering into a pull-up three from the top of the key with 18 seconds left on the shot clock, and swishing it to cut the lead back to two. https://www.youtube.com/watch?v=7ZnawVCrxAM&ab_channel=NBA

  • Harrison Butker Overestimates His Range
    by David Roth on May 17, 2024 at 9:11 pm

    As a general rule, something has to go extremely wrong for an NFL kicker to drive a week of discourse. When Steelers kicker Jeff Reed trashed a Sheetz bathroom and berated its graveyard-shift employees back in 2009—”caused damage to a towel dispenser as he was infuriated at the fact that there were no towels in it,” in the words of the local police’s statement—it did not occasion any somber statements from the Steelers or exhaustingly snarky japes from Wawa. It did not spark a national conversation about the crisis in the Kicker-American Community. While there is a through-line between Reed’s toilet meltdown and Harrison Butker’s disastrous commencement address at Benedictine College—a kicker marooned in a non-kicking situation, embarrassing himself and inconveniencing others—the Butker story has not disappeared quite as quickly as Reed’s incident did. Not yet, anyway. Before it can go away entirely, and before Butker can return to Being Unapologetic In His Masculinity in a corner of the locker room that people take great care to avoid, everyone needs to get on the record. The NFL, in response to the kicker’s broadsides against diversity, reiterated its institutional dedication to inclusion. GLAAD pointed out that it was a strange choice on Butker’s part to celebrate the proud graduates by treating them to a tight 20 minutes of free-associative Trad Cath boilerplate. The Benedictine Sisters of Mount St. Scholastica, the co-founding institution of Benedictine College, put a big statement on the front page of their website saying that they “do not believe that Harrison Butker’s comments in his 2024 Benedictine College commencement address represent the Catholic, Benedictine, liberal arts college that our founders envisioned and in which we have been so invested.” Kansas City’s official Twitter account made a point of noting that Butker does not actually live in Kansas City.

  • Absolute Skenes
    by Chris Thompson on May 17, 2024 at 8:56 pm

    Rocket-armed Pirates rookie pitcher Paul Skenes made his first major-league start Saturday, against the Cubs, in Pittsburgh. It was fine. Skenes struck out seven Cubs in four innings, which is neat, but he also walked two, hit a guy, and allowed six hits, including a double and a sockdolager, and the visitors pushed across three runs before Skenes was pulled on 84 pitches. There was plenty to like—Skenes threw a whopping 17 pitches at 100 mph or faster—but it was not the big Stephen Strasburg-esque dominating performance fans might’ve wanted. It was fine. The Pirates later tried to lose the game but could not, although Skenes did not collect the win. Skenes took the mound again Friday afternoon for his second start, also against the Cubs, this time on the road. This time the big righty was out for blood, and mowed through Chicago’s lineup with jaw-dropping stuff. Each of the first seven Cubs hitters to come to the plate went down on strikes. Skenes did not allow a base runner until the fifth inning; when he was pulled after the sixth inning, the Cubs still had not registered a single hit, and he’d struck out 11. Baseball knowers were telling the truth: This guy can huck the absolute bejeezus out of that dang baseball.