This AI Paper Shows How ChatGPT’s Toxicity Can Increase Up To Six-Fold When Assigned A Persona

With recent technological advancements, large language models (LLMs) like GPT-3 and PaLM have exhibited remarkable generation capabilities across a wide range of domains such as education, content creation, healthcare, and research. For instance, these models can help writers enhance their writing style and assist budding developers in generating boilerplate code. Moreover, the availability of several third-party APIs has accelerated the adoption of LLMs in consumer-facing systems, from tools used by students to healthcare systems used by hospitals. In such scenarios, however, the safety of these systems becomes a fundamental issue, as people trust them with sensitive personal information. This calls for a clearer picture of the different capabilities and limitations of LLMs.

However, most previous research has focused on making LLMs more powerful by employing more advanced and sophisticated architectures. Although this research has significantly advanced the NLP community, it has also sidelined the safety of these systems. On this front, a team of researchers from Princeton University and Georgia Tech collaborated with researchers from the Allen Institute for AI (AI2) to bridge this gap by performing a toxicity analysis of OpenAI’s revolutionary AI chatbot, ChatGPT. The researchers evaluated toxicity in over half a million ChatGPT generations, and their investigation revealed that when ChatGPT’s system parameter was set to assign it a persona, its toxicity increased multifold across a wide range of topics. For example, when ChatGPT’s persona is set to that of the boxer Muhammad Ali, its toxicity increases almost three-fold compared to its default settings. This is particularly alarming because ChatGPT is currently being used as a foundation for several other technologies, which can then generate the same level of toxicity through such system-level modifications. The work thus focuses on gaining deeper insight into the toxicity of ChatGPT’s generations when it is assigned different personas.

The ChatGPT API provides a feature that allows the user to assign a persona via its system parameter, so that the persona sets the tone for the rest of the conversation and influences how ChatGPT converses. For their study, the researchers curated a list of 90 personas from different backgrounds and countries, such as entrepreneurs, politicians, and journalists. These personas were assigned to ChatGPT to analyze its responses over approximately 128 critical entities, such as gender, religion, and profession. The team also asked ChatGPT to continue certain incomplete phrases about these entities to gather more insights. The final findings showed that assigning ChatGPT a persona can increase its toxicity by up to six times, with ChatGPT frequently producing harsh outputs and indulging in negative stereotypes and beliefs.
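The article does not reproduce the researchers’ prompts or scoring setup, so the sketch below only illustrates the general recipe: assign a persona through the ChatGPT API’s system message, generate text about an entity, and score it for toxicity. It assumes the pre-1.0 openai Python SDK and uses the open-source detoxify package as a stand-in scorer; the persona wording and prompt are hypothetical.

```python
# Illustrative sketch only: persona assignment + toxicity scoring.
import openai
from detoxify import Detoxify

openai.api_key = "YOUR_API_KEY"  # placeholder

def generate_with_persona(persona: str, prompt: str) -> str:
    """Ask ChatGPT to respond while role-playing the given persona."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Speak exactly like {persona}."},
            {"role": "user", "content": prompt},
        ],
    )
    return response["choices"][0]["message"]["content"]

scorer = Detoxify("original")  # small pretrained toxicity classifier

text = generate_with_persona("a famous boxer", "Say something about journalists.")
print(scorer.predict(text)["toxicity"])  # toxicity probability in [0, 1]
```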

The team’s research showed that the toxicity of the outputs varied significantly depending on the persona ChatGPT was given, which the researchers theorize stems from ChatGPT’s understanding of the persona based on its training data. One finding, for instance, was that ChatGPT was twice as toxic when assigned the persona of a journalist as when assigned that of a businessperson, even though this ranking does not necessarily reflect reality. The study also showed that specific populations and entities are targeted more frequently (nearly three times more) than others, demonstrating the model’s inherently discriminatory behavior. For instance, toxicity directed at a persona’s gender was roughly 50% higher than toxicity based on race. These fluctuations could be damaging to users and derogatory to the individuals in question. Moreover, malicious actors can build technologies on ChatGPT that generate content harmful to an unsuspecting audience.

This study’s analysis of ChatGPT’s toxicity revealed three main things: the model can be significantly more toxic when personas are assigned (up to six times more toxic than the default); the model’s toxicity varies greatly depending on the persona’s identity, with ChatGPT’s opinion of the persona playing a significant role; and ChatGPT can discriminatorily target specific entities by being more toxic when creating content about them. The researchers also noted that, even though ChatGPT was the LLM they used in their experiments, their methodology extends to any other LLM. The team hopes their work will motivate the AI community to develop technologies that provide ethical, secure, and reliable AI systems.

Check out the Paper and Reference Article. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 18k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.



Meet Alibaba’s ChatGPT Competitor Tongyi Qianwen: a Large Language Model that will be Embedded in its Tmall Genie Smart Speakers and Workplace Messaging Platform DingTalk

Artificial intelligence has been rapidly growing in popularity and importance in the tech industry over the past few years, with companies investing heavily in AI research and development. One particular area where AI is making waves is in the creation of AI chatbots, which are designed to assist with a variety of tasks ranging from customer service to office management. Recently, Chinese cloud companies such as Alibaba Cloud have been stepping up their efforts to launch AI chatbots in order to compete with American counterparts such as OpenAI, Microsoft, and Google.

One such AI chatbot is Alibaba Cloud’s Tongyi Qianwen, which claims to be “a large model that specifically responds to human instructions, an efficiency assistant, and an idea generator.” Tongyi Qianwen is designed to assist with various tasks, including making summaries for meetings or documents, drafting copywriting, generating pictures, planning itineraries, providing shopping suggestions, and even generating apps from functional sketches uploaded by users. The AI chatbot is intended to be an all-in-one solution for businesses and individuals looking to streamline their operations and increase productivity.

Alibaba Cloud CEO Zhang Yong stated at a recent developer conference that, in the future, all of Alibaba’s products will be integrated with Tongyi Qianwen and transformed by it. Currently, Alibaba’s enterprise instant-messaging software DingTalk and its smart speaker Tmall Genie are testing services that integrate the model. For example, they can summarize key points from online meetings or chat messages, generate action items, and send them to groups. This integration allows businesses to benefit from the efficiency and convenience of Tongyi Qianwen in their daily operations.
🚀 Check Out 100’s AI Tools in AI Tools Club

Alibaba Cloud Intelligence has also announced plans to open up the base model of Tongyi Qianwen in the future, allowing enterprises to create their own large-scale language models and services. The business model involves developers or enterprises purchasing Alibaba Cloud’s computing and software platform services to develop, train, or deploy their AI applications. This move will further increase the adoption of AI chatbots in the business world and provide more opportunities for businesses to customize and optimize their AI chatbot solutions.

Tongyi Qianwen faces stiff competition from other Chinese tech companies’ offerings, such as Baidu’s Wenxin Yiyan, SenseTime’s SenseNova, and 360’s 360 Zhinao. Baidu’s Wenxin Yiyan is an AI chatbot designed to understand the Chinese language and culture. SenseTime’s SenseNova is similar to ChatGPT and is designed to provide natural language processing and AI chatbot services. 360 Zhinao is based on 360GPT, the company’s model for large-scale language processing.

The emergence of AI chatbots from Chinese cloud companies demonstrates the growing importance of AI in the tech industry. The competition between Chinese and American companies in this field is fierce, and it remains to be seen which AI chatbot will emerge as the market leader. However, it is clear that the trend towards AI assistants is likely to continue in the coming years. With the increasing demand for automation and efficiency in the business world, AI chatbots are poised to play an important role in helping businesses to optimize their operations and increase productivity.

One potential concern with the increasing adoption of AI chatbots is the potential displacement of human workers. While AI chatbots can certainly help businesses to automate certain tasks and increase efficiency, they are not a complete substitute for human workers. It is important for businesses to consider the potential impact of AI chatbots on their workforce and take steps to ensure that their employees are able to adapt and evolve in the face of changing technological trends.

Don’t forget to join our 18k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at [email protected]



A New Study Proposes Automatic Taxonomic Identification Based On The Fossil Image Dataset (>415,000 images) And Deep Convolutional Neural Networks

Paleontology is a fascinating field that helps us understand the history of life on Earth by studying ancient life forms and their evolution. However, one of the major challenges in paleontological research is the labor-intensive and time-consuming taxonomic identification process, which requires extensive knowledge of and experience with a particular taxonomic group. Moreover, identification results are often inconsistent across researchers and communities.

Deep learning techniques have emerged as a promising solution for supporting the taxonomic identification of fossils. In this context, a Chinese research team recently published an article exploring the potential of deep learning for improving taxonomic identification accuracy.

The main contribution of this paper is the creation and validation of a large and comprehensive fossil image dataset (FID) using web crawlers and manual curation. The dataset includes 415,339 images from 50 different clades of fossils, including invertebrates, vertebrates, plants, microfossils, and trace fossils. A convolutional neural network (CNN) was used to classify the fossil images and achieved high classification accuracies, demonstrating the potential of the FID for automated fossil identification and classification. The authors also made the FID publicly available for future use and development.

This study experimentally investigates the use of transfer learning with models trained on ImageNet to identify and classify fossils in the Fossil Image Database (FID). The authors found that freezing half of the network layers as feature extractors and training the remaining layers yielded the best performance. Data augmentation and dropout were effective methods to prevent overfitting, while frequent learning rate decay and large training batch sizes contributed to faster convergence and high accuracy. The study also examined the impact of imbalanced data on the algorithm and employed sampling methods for imbalanced learning. The dataset’s quality was important for accurate identification, with microfossils performing well due to the availability of high-quality images, while certain fossils with poor preservation and few samples performed poorly. The authors also found that the large intraclass morphological diversity of certain clades hindered identification accuracy due to the difficulty of the DCNN architecture in extracting discriminative characteristics.
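As a rough illustration of that recipe (not the authors’ code, and substituting a torchvision ResNet-50 for the Inception-ResNet-v2 they report), a transfer-learning setup in PyTorch might look like this:

```python
# Sketch: freeze roughly half of an ImageNet-pretrained backbone as a feature
# extractor, then train the remaining layers with augmentation, dropout, and LR decay.
import torch
import torch.nn as nn
from torchvision import models, transforms

num_clades = 50  # the FID covers 50 fossil clades

model = models.resnet50(pretrained=True)
blocks = list(model.children())
for block in blocks[: len(blocks) // 2]:      # freeze the earlier half of the network
    for p in block.parameters():
        p.requires_grad = False
model.fc = nn.Sequential(                     # new classification head with dropout
    nn.Dropout(0.5), nn.Linear(model.fc.in_features, num_clades)
)

train_tfms = transforms.Compose([             # data augmentation against overfitting
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01, momentum=0.9
)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)  # frequent LR decay
criterion = nn.CrossEntropyLoss()
# (training loop over large batches omitted)
```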

The Inception-ResNet-v2 architecture achieved an average accuracy of 0.90 in the test dataset when using transfer learning. Microfossils and vertebrate fossils had the highest identification accuracies of 0.95 and 0.90, respectively. However, clades such as sponges, bryozoans, and trace fossils, which had various morphologies or few samples in the dataset, had identification accuracies below 0.80.

In conclusion, deep learning techniques, particularly transfer learning, have shown promising results in improving the accuracy and efficiency of taxonomic identification of fossils. The creation and validation of a large and comprehensive fossil image dataset, such as the Fossil Image Database (FID), is crucial for achieving high identification accuracy. Its availability for public use and development is beneficial for advancing the field of paleontology. However, the accuracy of deep learning models depends on the dataset’s quality and diversity, with certain clades posing challenges due to their intraclass morphological diversity or poor preservation. Further research and development in deep learning techniques and large-scale fossil image datasets are necessary to overcome these challenges and improve the accuracy and efficiency of paleontological research.

Moreover, deep learning techniques in paleontology can potentially transform the field beyond taxonomic identification. These techniques can extract more information from fossil data, for example by segmenting and reconstructing fossils, integrating fossil data with other types of data, and detecting patterns and anomalies in large-scale fossil datasets. This expands our understanding of the history of life on Earth, paving the way for exciting discoveries and advancements.

Check out the Paper. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 18k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.



Researchers From Google AI and UC Berkeley Propose an AI Approach That Teaches LLMs to Debug its Predicted Program via Few-Shot Demonstrations

Producing accurate code in a single attempt is challenging for many programming tasks. Code generation has long been studied, with applications including code synthesis from natural language, programming by examples, and code translation. Recent large language models, in particular, have substantially improved over earlier deep neural networks. One line of research has developed reranking techniques to choose the best candidate from multiple samples, typically requiring tens of samples; these techniques were inspired by the observation that correct code is much more likely to appear when various programs are sampled from the model.

It makes intuitive sense that a programmer’s first piece of code is usually inaccurate. Rather than discarding faulty code entirely, humans typically examine it, look at the execution results, and then make adjustments to fix implementation flaws. Previous research has proposed deep learning models that repair the predicted code, showing considerable performance improvements on various coding tasks. Nevertheless, these methods require extra training for the code repair model.

Prior studies suggest that large language models are not yet able to correct code in the absence of external feedback, such as unit tests or human instructions, even though some recent work shows that these models can generate feedback messages to critique and refine their outputs in some natural language and reasoning domains. In this study, researchers from Google Research and UC Berkeley propose SELF-DEBUGGING, which uses few-shot prompting to teach a large language model to debug its own predicted code. SELF-DEBUGGING instructs the model to execute the code and then generate a feedback message based on the code and its execution result, without requiring any additional model training.

In contrast to earlier work on using human feedback for code repair, where the feedback message describes the code errors and how to correct them, SELF-DEBUGGING teaches the model to detect implementation issues via code explanation. This debugging procedure is akin to the rubber duck debugging technique used by human programmers: describing the code to a rubber duck line by line in natural language improves debugging effectiveness without professional help. The entire SELF-DEBUGGING technique is shown in Figure 1. The authors evaluate SELF-DEBUGGING with code-davinci-002 from the GPT-3 model family.
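A schematic sketch of that loop is shown below; the paper’s actual few-shot prompts are not reproduced here, and call_llm is a placeholder for whatever LLM client is used.

```python
# Schematic SELF-DEBUGGING-style loop: generate, execute, explain, and repair.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def run_unit_tests(code: str, tests: str):
    """Execute code against unit tests; return (passed, execution feedback)."""
    env: dict = {}
    try:
        exec(code, env)        # caution: run untrusted code only in a sandbox
        exec(tests, env)
        return True, "All tests passed."
    except Exception as err:   # collect the error message as feedback
        return False, f"{type(err).__name__}: {err}"

def self_debug(task: str, tests: str, max_turns: int = 3) -> str:
    code = call_llm(f"Write a Python function for: {task}")
    for _ in range(max_turns):
        passed, feedback = run_unit_tests(code, tests)
        if passed:
            break
        # Rubber-duck step: ask the model to explain the code line by line,
        # identify the bug using the execution feedback, and return a fix.
        code = call_llm(
            f"Task: {task}\nCode:\n{code}\nExecution result: {feedback}\n"
            "Explain the code line by line, identify the bug, and return fixed code."
        )
    return code
```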

SELF-DEBUGGING delivers state-of-the-art performance on a variety of code-generation tasks, including text-to-SQL generation, code translation, and text-to-Python generation. On the Spider benchmark for text-to-SQL generation, where the problem description includes no unit tests, self-debugging with code explanation reliably improves the baseline by 2–3% across varying numbers of initial programs and improves prediction accuracy on the most complex SQL queries by 9%.

Using unit tests coupled with code explanation on TransCoder for code translation and MBPP for text-to-Python generation increases accuracy by up to 12%. In comparison, code explanation alone without debugging also regularly improves code translation performance by 2–3%. Self-debugging increases sample efficiency and can perform on par with or better than baseline models that sample more than 10 predictions. According to their research, teaching large language models to perform SELF-DEBUGGING without human supervision is another promising way to increase coding capability and lower the sampling cost needed to complete difficult tasks. This is in addition to improving their ability to generate code from scratch.

Check out the Paper. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 18k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.



How does GPT-4’s steerable nature set it apart from the previous Large Language Models (LLMs)?

The release of OpenAI’s new GPT-4 is already receiving a lot of attention. This latest model is a great addition to OpenAI’s efforts and the latest milestone in improving deep learning. GPT-4 comes with new capabilities thanks to its multimodal nature: unlike the previous version, GPT-3.5, which only lets ChatGPT take textual inputs, GPT-4 accepts both text and images as input. With its transformer architecture, GPT-4 displays human-level performance and is more reliable and creative than its predecessors.

OpenAI’s GPT-4 model has also been described as more steerable than previous versions. Recently, in a Twitter thread, AI researcher Cameron R. Wolfe discussed the concept of steerability in large language models (LLMs), specifically in the case of GPT-4. Steerability refers to the ability to control or modify a language model’s behavior, which includes making the LLM adopt different roles, follow particular instructions from the user, or speak with a certain tone.

Steerability lets a user change the behavior of an LLM on demand. In his thread, Wolfe also noted that the older GPT-3.5 version used by ChatGPT was not very steerable and had limitations for chat applications: it mostly ignored system messages, and its dialogue tended to stick to a fixed persona and tone. GPT-4, by contrast, is more reliable and capable of following detailed instructions.

In GPT-4, OpenAI has provided additional controls within the GPT architecture. System messages now let users customize the AI’s style and task as desired: a user can conveniently prescribe the AI’s tone, word choice, and style in order to receive a more specific and personalized response. Wolfe explains that GPT-4 is trained through self-supervised pre-training and RLHF-based fine-tuning. Reinforcement Learning from Human Feedback (RLHF) trains the language model using feedback from human evaluators, which serves as a reward signal for evaluating the quality of the generated text.

To make GPT-4 more steerable, safer, and less likely to produce false or deceptive information, OpenAI has hired experts in multiple fields to evaluate the model’s behavior and provide better data for RLHF-based fine-tuning. These experts can help identify and correct errors or biases in the model’s responses, ensuring more accurate and reliable output.

Steerability can be used in many ways, such as setting GPT-4’s system message in API calls. A user can instruct the model to write in a different style, tone, or voice with prompts like “You are a data expert” and have it explain a data science concept. When set as a “Socratic tutor” and asked how to solve a linear equation, GPT-4 responded by saying, “Let’s start by analyzing the equations.” In conclusion, GPT-4’s steerability provides greater control over an LLM’s behavior, enabling more diverse and effective applications. It can still hallucinate facts and make reasoning errors, but it remains a very significant development in the AI industry.
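As a concrete illustration of steering via the system message (written against the pre-1.0 openai Python SDK; the system prompt itself is illustrative, not OpenAI’s published wording):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": "You are a Socratic tutor. Never give the answer directly; "
                       "guide the student with questions.",
        },
        {"role": "user", "content": "How do I solve 3x + 2 = 14?"},
    ],
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```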

Check out the source. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 18k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.




Best Natural Language Processing (NLP) Tools/Platforms (2023)

An essential area of artificial intelligence is natural language processing (NLP). The widespread use of smart devices (also known as human-to-machine communication), improvements in healthcare using NLP, and the uptake of cloud-based solutions are driving the widespread adoption of NLP in the industry. But what is NLP exactly, and why is it significant?

Linguistics, computer science, and artificial intelligence all meet in NLP. A good NLP system can comprehend the contents of documents, including their subtleties. NLP applications analyze and interpret vast volumes of natural language data (all human languages, whether English, French, or Mandarin, are natural languages) to replicate human interactions in a human-like manner.

Why is NLP so essential?

We depend on machines more than ever since they allow us to be considerably more productive and accurate than we could ever be on our own: they do not tire, they do not complain, and they never get bored. NLP tasks, however, pose a significant challenge for them.

The uniqueness and ambiguity of natural language make NLP a difficult area to work in. It is relatively easy for humans to learn a language, but quite difficult for machines to understand it. To give structure to data deemed unstructured (i.e., unlike a record of a store’s transaction history, free text lacks a schema), we must first find solutions that address the problems of linguistic creativity and ambiguity.

Tools for NLP projects

Many open-source programs are available to uncover insightful information in unstructured text (or other natural language data) and resolve various issues. Although by no means comprehensive, the list of frameworks presented below is a wonderful place to start for anyone or any business interested in using natural language processing in their projects. Without further ado, the most popular frameworks for Natural Language Processing (NLP) tasks are listed here.

NLTK

The Natural Language Toolkit (NLTK) is one of the leading frameworks for developing Python programs to manage and analyze human language data. The NLTK documentation states, “It offers wrappers for powerful NLP libraries, a lively community, and intuitive access to more than 50 corpora and lexical resources, including WordNet.” It also offers a suite of text-processing libraries for categorization, tokenization, stemming, tagging, parsing, and semantic reasoning.

Learning NLTK takes time, just like learning most things in programming. The book Natural Language Processing with Python, produced by the NLTK designers themselves, is one of many books available to help you in your quest to understand the framework. It provides a very useful method for writing code to solve Natural Language Processing issues.
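A minimal NLTK example of tokenization and part-of-speech tagging:

```python
import nltk

nltk.download("punkt", quiet=True)                       # tokenizer models
nltk.download("averaged_perceptron_tagger", quiet=True)  # POS tagger model

tokens = nltk.word_tokenize("NLTK makes text processing in Python approachable.")
print(nltk.pos_tag(tokens))
# e.g. [('NLTK', 'NNP'), ('makes', 'VBZ'), ('text', 'NN'), ...]
```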

Stanford CoreNLP

The Stanford NLP community created and actively maintains the CoreNLP framework, a well-liked library for NLP tasks. Whereas NLTK and SpaCy are written in Python and Cython, CoreNLP is written in Java and requires a JDK on your machine (though it does have APIs for most programming languages).

The creators of CoreNLP call it “your one-stop shop for natural language processing in Java!” on the website. Token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, sentiment, coreference, quote attributions, and relations are just a few of the linguistic annotations that CoreNLP can derive for text. CoreNLP currently supports six languages: Arabic, Chinese, English, French, German, and Spanish.

The fact that CoreNLP is highly scalable makes it a top choice for difficult tasks, which is one of its key advantages. It was designed with speed in mind and has been tweaked to be exceptionally quick.

SpaCy

SpaCy is a library written in Python and Cython. Unlike NLTK, it ships with word vectors and pre-trained statistical pipelines, and tokenization is now supported for more than 49 languages.

This library can be regarded as one of the best for working with tokenization. The text can be broken into semantic units like words, articles, and punctuation.

All of the functionality needed for real-world projects is present in SpaCy. It also boasts some of the fastest and most accurate syntactic analysis of any NLP software now on the market.
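A minimal SpaCy example showing tokenization, part-of-speech tags, and named-entity recognition (assuming the small English pipeline has been installed with `python -m spacy download en_core_web_sm`):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

print([(token.text, token.pos_) for token in doc])   # tokens and POS tags
print([(ent.text, ent.label_) for ent in doc.ents])  # e.g. ('Apple', 'ORG')
```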

GPT-3

GPT-3 is a tool recently released by OpenAI. It is both powerful and popular. Since text prediction is its primary use, it works like an autocomplete application: given several examples of the desired text, GPT-3 will generate something similar but distinct.

OpenAI continues to develop the GPT family, and the third version is a major step up. One huge advantage is its scale: GPT-3 has 175 billion parameters and was pre-trained on an enormous amount of text, so the results it produces read much closer to natural language.
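A minimal completion call, written against the pre-1.0 openai Python SDK with an illustrative model name and prompt:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

completion = openai.Completion.create(
    model="text-davinci-003",
    prompt="Write a one-sentence product description for a reusable water bottle:",
    max_tokens=60,
    temperature=0.7,
)
print(completion["choices"][0]["text"].strip())
```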

Apache OpenNLP

Ease of use is crucial when working with a tool for extended periods, yet it is hard to find in open-source natural language processing technology: many libraries have the required capability but are too challenging to use.

Apache OpenNLP is an open-source library for people who value practicality and accessibility. Like Stanford CoreNLP, it is a Java-based NLP library.

OpenNLP is a simple but effective tool in contrast to the cutting-edge libraries NLTK and Stanford CoreNLP, which have a wealth of functionality. It is among the finest solutions for named entity recognition, sentence detection, POS tagging, and tokenization. Additionally, you can modify OpenNLP to meet your needs and eliminate unnecessary features.

Google Cloud

The Google Cloud Natural Language API offers several pre-trained models for sentiment analysis, content categorization, and entity extraction. AutoML Natural Language is another feature that enables you to build custom machine learning models.

It uses Google’s question-answering and language-comprehension tools as part of the Google Cloud architecture.

TextBlob

TextBlob is another readily accessible natural language processing tool built on top of NLTK, and it is one of the quickest tools to get started with. It extends NLTK with extra features and a simpler interface for common text-processing tasks.

TextBlob’s sentiment analysis can be used to analyze customer interactions, for example on transcripts produced by speech recognition, and its models can be adapted with domain-specific language data.

Standardizing and localizing content is becoming common and advantageous: it would be great if your website or application could be localized automatically. Machine translation is another helpful TextBlob feature, and TextBlob’s language text corpora can be used to enhance it.
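A quick TextBlob sentiment check (polarity ranges from -1 for negative to 1 for positive, subjectivity from 0 for objective to 1 for subjective):

```python
from textblob import TextBlob

blob = TextBlob("The support team was friendly and resolved my issue quickly.")
print(blob.sentiment)  # e.g. Sentiment(polarity=0.45, subjectivity=0.65)
```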

Amazon Comprehend

The Amazon Web Services ecosystem includes the natural language processing (NLP) service Amazon Comprehend. Sentiment analysis, topic modeling, entity recognition, and other NLP applications can all be built with this API.

It extracts relevant information from text in emails, social media feeds, customer service tickets, product reviews, and other sources. Extracting text, key phrases, topics, sentiment, and other information from documents such as insurance claims can help streamline document-processing operations.

IBM Watson

IBM Watson is a suite of artificial intelligence (AI) services hosted on IBM Cloud. Natural language understanding is one of its important features, enabling you to identify and extract keywords, categories, emotions, entities, and more.

It is flexible, since it can be adapted to various industries from banking to healthcare, and it includes a library of documents to get you started.

AllenNLP

AllenNLP offers strong text preprocessing capabilities in a prototyping tool. SpaCy is more production-optimized than AllenNLP, which is used more frequently in research. AllenNLP is powered by PyTorch, a popular deep-learning framework that offers far more flexibility for model customization than SpaCy.

BERT

BERT stands for Bidirectional Encoder Representations from Transformers. It is a pre-trained model created by Google to predict what users want more accurately. Unlike earlier context-free methods such as word2vec or GloVe, BERT considers the words immediately adjacent to the target word, which can obviously change how the word is interpreted.
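A short example of BERT’s masked-word prediction through the Hugging Face transformers pipeline, which shows the bidirectional context at work:

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill_mask("The bank raised interest [MASK] this quarter."):
    print(pred["token_str"], round(pred["score"], 3))  # likely fillers and their scores
```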

Gensim

A corpus is a collection of linguistic data, and Gensim provides a variety of methods that can be applied regardless of the corpus size. Gensim is a Python package built with information retrieval and natural language processing in mind, and it features outstanding memory optimization, processing speed, and efficiency. Before installing Gensim, the scientific-computing packages NumPy and SciPy must be installed, because the library requires them.

Word2Vec

Word embedding represents each word as a vector. Based on the contexts in which they appear, words are transformed into vectors that can be used to train machine learning (ML) models to recognize similarities and differences between words. Word2Vec is an NLP tool for producing such word embeddings.
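A tiny Gensim Word2Vec example (a real corpus would contain far more sentences):

```python
from gensim.models import Word2Vec

sentences = [
    ["the", "fossil", "record", "shows", "gradual", "change"],
    ["the", "rock", "layer", "preserves", "the", "fossil"],
    ["language", "models", "learn", "word", "vectors"],
]
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

print(model.wv["fossil"][:5])                    # first few vector dimensions
print(model.wv.most_similar("fossil", topn=2))   # nearest neighbours by cosine similarity
```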

CogCompNLP

CogCompNLP is a tool created at the University of Pennsylvania. Available in Python and Java, it processes text data that can be stored locally or remotely, and it also works with big data. Some of its features are tokenization, part-of-speech tagging, chunking, lemmatization, semantic role labeling, and more.

Don’t forget to join our 18k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at [email protected]



AI in Travel: How is AI Reforming the Travel Industry

The travel industry has seen a significant transformation in recent years, thanks to incredible advancements in Artificial Intelligence (AI). From trip planning to arrival at the destination, AI-powered tools and solutions are reshaping how people travel.


A New Microsoft AI Research Shows How ChatGPT Can Convert Natural Language Instructions Into Executable Robot Actions

Large language models (LLMs) that can comprehend and produce human-like language have been made possible by recent developments in natural language processing. Because they have learned from vast quantities of data, certain LLMs can be adapted to specific tasks in a few-shot manner through conversation. ChatGPT is a good example of such an LLM. Robotics is one fascinating area where ChatGPT may be employed: it can be used to translate natural language commands into executable code for commanding robots. Generating robot programs from natural language commands is a desirable goal, and several existing studies pursue it, some of which are based on LLMs.

Unfortunately, the majority of these systems lack human-in-the-loop capability, were built for a constrained scope, or are hardware-dependent. Moreover, most of this research relies on particular datasets, making it necessary to collect new data and retrain models in order to adapt or extend them to different robotic situations. From the perspective of practical use, a robotic system that is easily adaptable to multiple applications or operating circumstances without a significant amount of data gathering or model retraining would be ideal. The benefit of adopting ChatGPT for robotic applications is that practitioners can start with a modest amount of sample data to adapt the model to particular applications, while making use of its language understanding and interaction capabilities as an interface.

Figure 1: Real-world prompts let ChatGPT translate multi-step human instructions into actionable robot sequences that can be carried out in diverse settings.

Although ChatGPT’s potential for robotic applications is getting attention, there is currently no proven approach for use in practice. In this study, researchers from Microsoft give a concrete illustration of how ChatGPT may be applied in a few-shot situation to translate natural language commands into a series of actions that a robot can carry out (Fig.1). The prompts were created with the goal of meeting the specifications typical of many real-world applications while also being set up to be easily adaptable.

To meet these requirements, they designed input prompts to encourage ChatGPT to 1) Output a sequence of predefined robot actions with explanations in a readable JSON format. 2) Represent the operating environment in a formalized style. 3) Infer and output the updated state of the operating environment, which can be reused as the next input, allowing ChatGPT to operate based solely on the memory of the latest operations. They conducted experiments to test the effectiveness of their proposed prompts in inferring appropriate actions for multi-stage language instructions in various environments. They listed the following requirements for this paper: 1) Simple interaction with robot execution systems or visual recognition software. 2) Suitability for diverse domestic settings. 3) The capacity to deliver any number of plain-English instructions while reducing the effect of ChatGPT’s token restriction.
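As an illustration of this prompt-and-parse pattern, here is a minimal sketch; the action names, JSON schema, and call_llm function are placeholders rather than the prompts released by the researchers.

```python
# Sketch: ask the model for a JSON action plan plus an updated environment state,
# then feed that state back in on the next instruction.
import json

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in a ChatGPT client here")

SYSTEM_PROMPT = """You control a home robot. Respond ONLY with JSON of the form:
{"actions": [{"name": "<move_to|grasp|release>", "target": "<object>", "why": "<reason>"}],
 "environment": {"robot_at": "<location>", "holding": "<object or null>"}}"""

environment = {"robot_at": "kitchen", "holding": None}

def instruct(instruction: str) -> list:
    prompt = (
        f"{SYSTEM_PROMPT}\nCurrent environment: {json.dumps(environment)}\n"
        f"Instruction: {instruction}"
    )
    reply = json.loads(call_llm(prompt))
    environment.update(reply["environment"])  # reuse the updated state next turn
    return reply["actions"]                   # hand the action list to the robot executor
```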

They also noted that ChatGPT’s conversational capabilities enable users to modify its output using natural language feedback, which is critical for creating an application that is both secure and resilient while offering a user-friendly interface. The collection of robot actions, environment representation, and object names are all easily modifiable and may be used as templates in the suggested prompts. This paper’s contribution is to create and disseminate generic prompts that are easily adaptable to each experimenter’s needs, giving the robotics research community useful information. They are open-source and freely accessible on GitHub, along with their usage prompts.

Check out the Paper and Github. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 18k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.



Researchers At Stanford Have Developed An Artificial Intelligence-Based Approach To Optimize Road Tolls

Cities worldwide are plagued by traffic congestion, which not only results in lost productivity but also contributes to increased carbon emissions and noise pollution. To address this issue, congestion pricing has been proposed as a potential solution. Congestion pricing entails charging tolls for the use of busy roads to encourage drivers to avoid crowded areas and rush hours. However, determining the appropriate tolls for efficiently reducing traffic remains a challenge, and collecting user trip attributes such as origins and destinations for this purpose is difficult and raises privacy concerns.

Researchers at Stanford University have developed an innovative approach to optimize road tolls using artificial intelligence. This method involves dynamically adjusting tolls based on the number of cars traveling on certain roads at specific times to balance roadway supply and driver demand. This approach has the potential to improve congestion pricing systems in various cities worldwide.

Without requiring extra user trip information, the researchers used online learning, a branch of machine learning and artificial intelligence, to modify road tolls based on observations of driver behavior. By optimizing road tolls, the technique protects user privacy while easing traffic congestion. The researchers found that the only data points required to determine the supply and demand for roads are the total number of cars on each road at any given moment, information that is already available in cities thanks to contemporary sensing technology such as loop detectors.

Through the independent acts of choosing one road over another, drivers reveal aggregate preferences, enabling congestion pricing tolls to be increased on congested roads, thereby incentivizing travelers to take alternate routes or other modes of transportation. The online learning-based approach modifies tolls based only on observed aggregate flows on the transportation network’s routes at each time period.
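The article does not give the researchers’ exact update rule, so the following is only a rough illustrative sketch of an online toll adjustment that uses nothing but aggregate flow counts: tolls rise on roads whose observed flow exceeds capacity and fall otherwise.

```python
def update_tolls(tolls, observed_flows, capacities, step=0.1, min_toll=0.0):
    """One online-learning step using only aggregate flow counts per road."""
    new_tolls = {}
    for road, toll in tolls.items():
        excess = observed_flows[road] - capacities[road]   # demand vs. supply
        new_tolls[road] = max(min_toll, toll + step * excess)
    return new_tolls

tolls = {"road_a": 2.0, "road_b": 1.5}
tolls = update_tolls(
    tolls,
    observed_flows={"road_a": 120, "road_b": 60},
    capacities={"road_a": 100, "road_b": 80},
)
print(tolls)  # the toll rises on the congested road and falls on the underused one
```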

To validate the performance of their approach, the researchers compared it to an all-knowing “oracle” with complete information on users’ trip attributes. Testing the new approach on real-world traffic networks, the researchers observed that it outperformed even several traditional congestion pricing methods.

This research builds on previous work by the lead author and his colleagues, which focused on ensuring the equity of congestion pricing. That study proposed a redistributive approach in which lower-income drivers receive more money back than they pay out in tolls, while wealthier drivers are compensated mostly in the form of time not spent in traffic jams.

Moving forward, the researchers aim to combine the equitable approach to congestion pricing developed in the 2021 paper with the learning-based approach used in the new study. They also plan to further explore the design of incentive schemes for future mobility systems that consider equity and efficiency while reducing the cost of traffic congestion to society.

In conclusion, the researchers’ innovative approach to optimizing road tolls using artificial intelligence has promising potential to reduce traffic congestion and improve the efficiency of congestion pricing systems in cities worldwide. This approach preserves user privacy while dynamically adjusting tolls based on observed driver behavior, which could help minimize total traffic congestion costs to society while also considering societal considerations such as equity.

This article is based on this Stanford article. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 18k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.



Microsoft AI Open-Sources DeepSpeed Chat: An End-To-End RLHF Pipeline To Train ChatGPT-like Models

It is no exaggeration to say that ChatGPT-like models have had a revolutionary effect on the digital world. For this reason, the AI open-source community is working on projects (such as ChatLLaMa, Alpaca, etc.) that aim to make ChatGPT-style models more widely available. These models are extremely flexible and can execute tasks such as summarization, coding, and translation at or above human levels of expertise.

Despite these impressive efforts, there is still no publicly available end-to-end RLHF pipeline that can train a robust ChatGPT-like model. Even when access to the necessary computing resources is available, training efficiency is frequently less than 5% of what the hardware can deliver, and even with multi-GPU clusters, existing systems cannot support simple, fast, and inexpensive training of state-of-the-art ChatGPT-like models with billions of parameters.

These restrictions originate from the fact that the sophisticated RLHF training pipeline used by InstructGPT is not well-supported by existing DL systems, which are optimized for more conventional pre-training and fine-tuning pipelines. To make ChatGPT-like models more widely available and RLHF training more easily accessible, the Microsoft team is releasing DeepSpeed-Chat, which offers an end-to-end RLHF pipeline to train ChatGPT-like models. It has the following features:

1. A Convenient Environment for Training and Inferring ChatGPT-like Models: InstructGPT training can be executed on a pre-trained Hugging Face model with a single script using the DeepSpeed-RLHF system, allowing users to generate their own ChatGPT-like model. After the model is trained, an inference API can be used to test conversational interactions.

2. The DeepSpeed-RLHF Pipeline: The DeepSpeed-RLHF pipeline largely replicates the training pipeline from the InstructGPT paper. The team ensured full and exact correspondence between the three steps a) Supervised Fine-tuning (SFT), b) Reward Model Fine-tuning, and c) Reinforcement Learning with Human Feedback (RLHF). In addition, they also provide tools for data abstraction and blending that make it possible to train using data from various sources.

3. The DeepSpeed-RLHF System: Hybrid Engine (DeepSpeed-HE) for RLHF is a powerful and sophisticated system that combines the training and inference capabilities of DeepSpeed. The Hybrid Engine can easily switch between RLHF’s inference and training modes, taking advantage of DeepSpeed-Inference’s optimizations like tensor-parallelism and high-performance transformer kernels for generation, as well as RLHF’s many memory optimization strategies like ZeRO and LoRA. To further optimize memory management and data transfer across the various stages of RLHF, DeepSpeed-HE is additionally aware of the whole RLHF pipeline. The DeepSpeed-RLHF system achieves unprecedented efficiency at scale, allowing the AI community to quickly, cheaply, and conveniently access training on complex RLHF models.

4. Efficiency and Affordability: Because DeepSpeed-HE is over 15 times quicker than conventional systems, RLHF training may be completed quickly and cheaply.

5. Excellent Scalability: DeepSpeed-HE’s strong scalability on multi-node multi-GPU systems allows it to accommodate models with hundreds of billions of parameters.

6. Expanding Access to RLHF Training: DeepSpeed-HE enables data scientists without access to multi-GPU systems to build not just toy RLHF models but large and powerful ones that can be deployed in real-world settings, all with a single GPU for training.

The researchers have included a whole end-to-end training pipeline in DeepSpeed-Chat and modeled it after InstructGPT to make the training process as streamlined as possible.

The production process consists of three stages, sketched in code after the list below:

1. The pretrained language models are fine-tuned via supervised fine-tuning (SFT), in which human responses to various inquiries are carefully selected.

2. Next, the team performs “reward model fine-tuning,” which involves training a different (often smaller than the SFT) model (RW) using a dataset that includes human-provided rankings of numerous answers to the same inquiry.

3. Lastly, in RLHF training, the Proximal Policy Optimization (PPO) algorithm is used to further adjust the SFT model with the reward feedback from the RW model.
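A schematic outline of those three stages is sketched below with placeholder functions; this is not DeepSpeed-Chat’s actual API (see its GitHub repository for the real training scripts), and the model names are only illustrative.

```python
def supervised_finetune(base_model, demo_dataset):
    """Stage 1: fine-tune the pretrained LM on curated human demonstrations (SFT)."""
    ...

def train_reward_model(smaller_model, ranking_dataset):
    """Stage 2: train a reward model (RW) on human rankings of candidate answers."""
    ...

def rlhf_ppo(sft_model, reward_model, prompts):
    """Stage 3: adjust the SFT model with PPO using the reward model's feedback."""
    ...

sft_model = supervised_finetune("facebook/opt-1.3b", demo_dataset="sft_data")
reward_model = train_reward_model("facebook/opt-350m", ranking_dataset="ranking_data")
chat_model = rlhf_ppo(sft_model, reward_model, prompts="ppo_prompts")
```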

The AI community can now access DeepSpeed-Chat thanks to its open-sourced nature. On the DeepSpeed GitHub website, the researchers invite users to report issues, submit PRs, and participate in discussions.

Check out the Code. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 18k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.


