AI Tools at Salsita

This article provides insights into Salsita's utilization and future plans for AI tools. Learn how we harness these tools in our operations, the reasons driving their adoption, and our strategies for addressing associated concerns.

Jiří Staniševský - Chief Technical Officer


Introduction

Salsita has always kept pace with modern technologies, evaluating emerging ones and discerning when it makes sense to be an early adopter versus when the risk is too high. Ever since AI-powered tools became available, we've kept a close eye on the field and have evaluated several of them over the past year. We're confident that these tools are revolutionizing the software industry, and will continue to do so for the foreseeable future.

To put it rather bluntly, we have only two choices: adapt or make way for those who have. The decision is remarkably straightforward.

The novelty of truly useful AI tools is reflected in how poorly their benefits and drawbacks are understood, a subject this article aims to address. The subsequent text outlines what drives Salsita to use these tools, delves into prevailing concerns surrounding them, and explains how we mitigate or minimize these concerns.

Motivation

The capabilities of recent AI tools, especially GPT-4, are astounding, yet the day when they can work autonomously and replace human beings in the software development process is still on the horizon. Envisioning a tool that, when instructed, can write a complex piece of code that functions properly - let alone matches the quality of Salsita's engineers - remains a bit far-fetched. Then again, a shovel doesn't dig a hole by itself, yet it is invaluable in the process. In this case, we're talking about an extraordinarily sharp shovel.

The most apparent benefit of using an AI tool is the significant productivity boost it offers when utilized properly. As with any tool, the goal is to remove the monotonous aspects of your job. A prime example is GitHub Copilot, which suggests relatively large portions of code based on the intent you specify. The code then only needs verification and possible adjustment, instead of being written from scratch or copied from other sources.

The critical point to remember is that you still need the competency to execute the task independently; the tool can then amplify your efficiency to a substantial degree. Failure to meet this prerequisite can lead to serious problems - see the Concerns section below.

For our clients, the increased efficiency translates into an improved value-for-money ratio, meaning reduced costs and quicker delivery times. The individual tool descriptions below outline how each contributes to the success of our clients’ projects. While we don't have enough data yet to gauge the productivity boost accurately, our preliminary evaluation and external research suggest it could be quite significant.

During a study, GitHub asked 95 developers to complete the same task and found that the group using GitHub Copilot required, on average, half the time the other group needed. This implies a 100% productivity boost for that specific task. Naturally, the degree of enhancement depends on the task's complexity, verbosity, and the developer’s experience. In addition to these tangible metrics, the study also reported higher developer satisfaction and improved focus.
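The step from "half the time" to "a 100% productivity boost" is simple arithmetic, and it can be spelled out as a small sketch. The function name and the sample timings below are illustrative assumptions, not figures from the study.

```typescript
// Converts two completion times for the same task into a throughput gain.
// Finishing in half the time means twice the output per unit of time,
// which is a gain of 100 percent.
function productivityBoostPercent(
  timeWithTool: number,
  timeWithoutTool: number,
): number {
  return (timeWithoutTool / timeWithTool - 1) * 100;
}

// Hypothetical timings in minutes: 30 with the tool, 60 without it.
console.log(productivityBoostPercent(30, 60)); // 100
```

The same formula shows why smaller time savings translate into smaller boosts: the gain depends on the ratio of the two times, not on their difference.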

Our early evaluations align with these findings, particularly among our senior staff who report "feeling twice as productive" when implementing familiar and repetitive tasks. Conversely, more junior developers reported a lesser, though still notable, gain. We are optimistic that we can further amplify these benefits through deeper research, knowledge sharing, and employee training.

In addition to impacts on productivity, we also need to remember that the wider these tools are spread, the more users will depend on them and expect to have them readily available. Staffing a project that prohibits the use of these tools might soon become complicated, as people naturally prefer to simplify their work. Consequently, if Salsita fails to provide the expected tools across too many of our projects, we could struggle to retain and recruit skilled engineers.

Concerns

Privacy

Some believe that the strength of current AI tools lies in generating content - drafting a letter, creating an image, or writing a piece of code. However, the reality is that these tools' most significant advantage is their ability to understand context and subsequently generate relevant, and therefore useful, content.

When drafting a letter, along with instructing the tool, you also need to provide a context regarding the letter's theme. You outline topics you wish to elaborate on, specify the recipient, define who you are, etc. The tool then attempts to "understand" the context and infer missing details - such as the tone of the message. A letter between business partners will vastly differ from a love letter, even if they both reference a productive dinner from the previous evening.

The same principle applies to code generation. There's a substantial difference in utility between blindly suggesting an implementation of a function outside the context of the application you're writing, and a personalized suggestion that aligns with your coding style, tech stack, abstraction methods, and even pre-existing variables and functions in your code.

For AI tools to be beneficial, they require a significant amount of information - context - that they can process. The majority of today's tools are cloud-based, meaning the actual magic (referred to as inference) occurs on a server hosted by the provider. This inevitably raises questions about how the context is used, whether it's stored longer than necessary for the inference, and if so, how securely it's protected.

The answer to these questions varies for each tool. Generally, although most vendors would prefer to use the provided information for additional training of their neural networks, there are options to opt out. Specific privacy policies for each tool are outlined in their dedicated sections.

Ethics

All modern AI tools, built upon neural networks, are trained on massive amounts of data. This data is the result of someone's effort and, despite being publicly available on the internet, was not necessarily intended for extensive reuse. A significant portion of this content is also licensed under varying levels of restriction. However, when GitHub Copilot suggests a piece of code, it doesn't come with an appended license, a donation link, or credit for the author. One could view this form of content reuse as immoral or even illegal.

Fortunately, neural networks, similar to human brains, rarely retain exact pieces of information. Instead, as the model is trained, it abstracts the precise information, retaining only the conceptual essence. When asked to provide an answer, it melds these abstractions with the given context to synthesize a new response rather than replaying the original information. There are instances where there's minimal room for abstraction and/or the query posed has a very specific answer. In such cases, the model may synthesize the original information, or something very close to it. Under these circumstances, the question regarding the morality or legality of reusing the information is valid.

That being said, the likelihood that the model synthesized the exact licensed content is very low. GitHub Copilot takes this issue seriously and provides an option to activate a filter that cross-references every generated suggestion against all public code hosted on GitHub, skipping any suggestion that is an "exact" match.

Quality & Security

The data used for training the models is, of course, not flawless. The sheer volume of it helps because rare errors get drowned out in the model's "memory," but frequent mistakes persist. In addition to this, the tool may not always reliably comprehend the instructions or the context, sometimes due to its inherent limitations, but more often due to imperfect input. These factors combined can easily yield incorrect results. As is the case with virtually any tool, it's crucial that AI tools are used by people qualified to operate them - those who can review suggestions and spot errors. Blindly relying on the correctness of the results is a serious mistake.

Much like anyone with the ability to modify a code base, an AI tool can introduce vulnerabilities into it. The tool would obviously not do this intentionally (unless explicitly instructed). Issues of this nature are introduced for the same reasons as any other issue - because the patterns leading to the vulnerability are frequently present in the training data. In addition to this, it's crucial to understand that the tool only suggests chunks of code. While it tries to maintain consistency with the rest of the code, it can't ensure that the chunk is properly and safely integrated into the application. This underscores the need for an experienced operator to review the suggestions before implementing them.
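As an illustration of a vulnerable pattern that is common in public code, consider building a database query by string concatenation. The sketch below is an assumption made for illustration: the Query shape, the users table, and both function names are hypothetical, and no particular database driver is implied. It shows the kind of suggestion a reviewer should catch, next to a safer alternative.

```typescript
// Hypothetical query shape: SQL text plus separately passed values.
interface Query {
  text: string;
  values: string[];
}

// Vulnerable pattern: user input is concatenated into the SQL text,
// so crafted input can change the meaning of the query.
function findUserUnsafe(email: string): Query {
  return { text: `SELECT * FROM users WHERE email = '${email}'`, values: [] };
}

// Safer pattern: the SQL text is fixed and the input travels as a parameter.
function findUserSafe(email: string): Query {
  return { text: "SELECT * FROM users WHERE email = $1", values: [email] };
}

const maliciousInput = "x' OR '1'='1";
console.log(findUserUnsafe(maliciousInput).text);
console.log(findUserSafe(maliciousInput).text);
```

Both functions look equally plausible as a completion, which is why the review step matters more than the suggestion itself.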

Hallucinations

Hallucinations are a specific type of incorrect result produced by a generative neural network, where the network generates information that doesn't occur in the training data. As mentioned earlier, the network doesn't "store" all the information it encounters during training. It merely saves its abstract representation as a kind of rule. When synthesizing an answer, the network applies these rules, and their application can sometimes lead to erroneous, yet highly believable answers - hallucinations. Regrettably, hallucinations are challenging to identify as they appear just as credible as correct answers. GPT-3 is notorious for its tendency to hallucinate. While GPT-4 performs much better, using the model to look up specific factual information is still risky.

GitHub Copilot

GitHub Copilot is primarily an intelligent code completion tool integrated with the developer's IDE (editor). It suggests code snippets at the cursor, based on the surrounding code and related files. Copilot is the most pertinent AI tool for developers today and is rapidly becoming a standard feature in any developer's toolkit. Coding without Copilot will soon feel akin to writing code in Notepad.

Contrary to what the name suggests, GitHub Copilot can operate on any codebase, regardless of the version control software or hosting service used. In fact, it doesn't require any version control at all.

Highlights

Smart Completion: Unlike standard code completion features in IDEs, Copilot suggests whole chunks of code, which significantly reduces the amount of code to type, especially when using verbose frameworks with a lot of boilerplate. Our evaluation shows that the accuracy of suggestions is impressively good, and consistency with the existing codebase is also excellent.

2nd Memory: For most modern developers, stackoverflow.com has become an extension of their own memory. We repeatedly search for the same snippet, technique, or function name because searching costs so little that it isn't worth clogging our memory with infrequently used information. With Copilot, things are even simpler - just leave a comment in your code. Copilot understands that it should suggest code that matches the comment, combines it with the overall context, and suggests the implementation. Unlike looking up the snippet on the internet, the implementation is already personalized to match your codebase. Sometimes, not even a comment is needed; a declared variable might be all it takes. Have a list of employees and want only those located in the Prague office? Just declare a variable employeesFromPrague and let Copilot suggest how to populate it using a filter expression.
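To make the employeesFromPrague example concrete, here is a minimal TypeScript sketch. The Employee type, its office field, and the sample data are assumptions made for illustration; the filter expression is the kind of completion one might expect, not a recorded Copilot suggestion.

```typescript
// Hypothetical data shape, assumed for illustration only.
interface Employee {
  fullName: string;
  office: string;
}

const employees: Employee[] = [
  { fullName: "Alice", office: "Prague" },
  { fullName: "Bob", office: "Brno" },
  { fullName: "Carol", office: "Prague" },
];

// Typing just the variable name gives the tool a strong hint about intent.
// A plausible completion is a plain filter on the office field:
const employeesFromPrague = employees.filter(
  (employee) => employee.office === "Prague",
);

console.log(employeesFromPrague.map((employee) => employee.fullName));
```

The suggestion only works this well because the surrounding code already defines employees and the office field, which is the context the tool draws on.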

Inline Documentation: Drafting a documentation block for a function or annotating code has never been easier. You simply start a comment block, and Copilot will attempt to summarize what happens within the code below.
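As a rough sketch of what this looks like in practice, suppose the function below already exists and the developer types only the opening of the comment block. The function, its parameters, and the wording of the comment are hypothetical; the comment stands in for the kind of summary the tool might draft.

```typescript
/**
 * Sums the given net prices and adds VAT at the given rate.
 *
 * @param netPrices - Net prices of the individual items.
 * @param vatRate - VAT rate as a fraction, for example 0.21 for 21 percent.
 * @returns The total price including VAT.
 */
function totalWithVat(netPrices: number[], vatRate: number): number {
  const netTotal = netPrices.reduce((sum, price) => sum + price, 0);
  return netTotal * (1 + vatRate);
}

console.log(totalWithVat([100, 200], 0.21));
```

A drafted comment still needs a quick read to confirm it describes what the code really does, especially around units and edge cases.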

Learning: Regardless of a developer's seniority, there's always room for learning. As Copilot assists with rudimentary typing, it often suggests alternative approaches to problems, which can be superior to what the developer initially intended. Of course, the developer needs to be skilled enough to assess the correctness of the suggestion.

Privacy

GitHub Copilot for Business, which Salsita is currently evaluating, does not store either the context or the generated suggestions for any purpose. The only data it collects is "user engagement" data - events that occur in the editor when interacting with the tool, such as requesting a suggestion, receiving one, accepting it, and so forth. This user engagement data encompasses a broad range of metrics and tracks the user, but does not gather the text of prompts or suggestions. There is no way to opt out of submitting the user engagement data, but as there is no client information involved, this should not pose any obstacle.

Ethics

Copilot offers the capability to filter out suggestions that match public code found on GitHub. It will not suggest any content that is an "exact" match with the public code.

Quality & Security

There's no way the tool could alter the codebase without the developer knowing about it. Code produced using Copilot undergoes every check and review process that regular human-written code does.

ChatGPT

There is probably no need to introduce ChatGPT nowadays. As a chat-like interface, it provides access to the underlying GPT language model, in as raw a form as a regular user can get. The tool, despite its limitations, is truly universal and can help with various generic tasks. Some examples:

  • Proofreading - fixing grammar and improving style
  • Expansion - given a topic and/or important facts, drafts full-blown texts in a selected style
  • Summarization - given an exhaustive text, extracts important information
  • Querying knowledge - using natural language questions

Large language models nicely show how similar a programming language is to a natural one. ChatGPT performs very well on programming languages, allowing us to proofread, expand, summarize, and even query knowledge just as with natural language. The upcoming GitHub Copilot X (based on GPT-4) and planned IDE integration improvements might render standalone ChatGPT useless for developers, but until then it remains a good companion.

Highlights

Editing Natural Language Texts: Process descriptions, requirement specifications, acceptance criteria, test cases – all of these are texts written in natural language. With appropriate instructions, ChatGPT can save a significant amount of time, enhance readability, and make the text appear more professional. Fun fact: The text you're currently reading has been proofread by ChatGPT.

Code / Specs to Documentation: You can input a function, class, or even an entire file, and request documentation at various levels of detail and technical depth. You can even input an OpenAPI specification that describes your API and obtain a highly reasonable summary.

Knowledge Base with Personalized Explanations: Ever wondered how to describe an API endpoint in an OpenAPI specification? All you need to do is ask: “How would I declare an API endpoint, which would change the user’s email, in an OpenAPI spec?” Obviously, the result will be a draft that needs to be polished and adapted to match the rest of the codebase, but as a starting point, this can be invaluable. Also, since the tool has a chat interface, you can ask for additional information not present in the original answer, or supply additional context you forgot to provide earlier, and the tool will adapt the suggestion to better suit your needs.
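For a sense of what such a draft might contain, the sketch below expresses a possible answer as a TypeScript object that can be serialized to JSON. It follows the general OpenAPI 3.0 layout, but every specific in it is an assumption made for illustration: the path /users/{userId}/email, the operationId, the choice of PUT, and the response codes.

```typescript
// Hypothetical draft of an OpenAPI 3.0 document with a single endpoint
// that changes a user's email address.
const changeEmailSpec = {
  openapi: "3.0.3",
  info: { title: "Example User API", version: "1.0.0" },
  paths: {
    "/users/{userId}/email": {
      put: {
        summary: "Change the user's email address",
        operationId: "changeUserEmail",
        parameters: [
          { name: "userId", in: "path", required: true, schema: { type: "string" } },
        ],
        requestBody: {
          required: true,
          content: {
            "application/json": {
              schema: {
                type: "object",
                required: ["email"],
                properties: { email: { type: "string", format: "email" } },
              },
            },
          },
        },
        responses: {
          "204": { description: "Email changed" },
          "400": { description: "Invalid email address" },
        },
      },
    },
  },
};

console.log(JSON.stringify(changeEmailSpec, null, 2));
```

A draft like this still has to be aligned with the project's existing conventions, for example authentication, error formats, and naming.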

Alternative for “Lorem Ipsum” Generator: After almost 500 years of unrivaled use, “Lorem Ipsum,” the popular pseudo-Latin placeholder text, is finally facing a worthy competitor. ChatGPT can produce random yet coherent text that aligns with the chosen topic in mere seconds. All it requires is a prompt such as: "Generate a random electronics product name along with a one-paragraph description." As ChatGPT comprehends more than just natural language, you can also instruct it to generate random data in JSON format. All it requires is a template, topic, and the size of the sample you wish to obtain.
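The three ingredients mentioned above (template, topic, and sample size) can be assembled into a prompt with a small helper. This is a minimal sketch: buildSampleDataPrompt and the template fields are hypothetical, and the code only builds the prompt text; it does not call any ChatGPT API.

```typescript
// Hypothetical helper that turns a JSON template, a topic, and a sample size
// into a prompt asking for random but coherent sample data.
function buildSampleDataPrompt(
  template: Record<string, string>,
  topic: string,
  sampleSize: number,
): string {
  return [
    `Generate ${sampleSize} random ${topic} records as a JSON array.`,
    "Each record must follow this template:",
    JSON.stringify(template, null, 2),
    "Respond with JSON only.",
  ].join("\n");
}

const samplePrompt = buildSampleDataPrompt(
  { productName: "string", description: "string", priceEur: "number" },
  "electronics product",
  5,
);

console.log(samplePrompt);
```

Asking for JSON only makes the reply easier to paste into fixtures or mock servers, though the output should still be checked for validity before use.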

Quality & Security

Just like with Copilot, or any other tool really, quality and security are in the hands of the user. ChatGPT only responds with text and snippets. It’s the responsibility of the user to assess the safety and quality of the suggestions and decide whether they should be integrated into the final work. When looking up information, the user needs to be aware of the risk of hallucinations and verify the information accordingly.

At Salsita, we use ChatGPT as an advisor, and there’s no plan to integrate it into any process where it would perform actions autonomously.

Privacy

By default, OpenAI (the vendor of ChatGPT) stores all communication with the chatbot as training material for their neural network. However, users can opt out of this in their account settings, at the expense of losing chat history. The planned ChatGPT for Business will allow global opt-out from data sharing, but until then, Salsita pledges to ensure that all personnel using the tool with client know-how will have the sharing feature, and consequently history, disabled.

Notion AI

We use Notion as our primary company workspace, where we share vital information about Salsita with our employees. We also store project-related information in Notion for some of our clients. Recently, Notion introduced an AI-powered editing assistant, boasting capabilities similar to ChatGPT but tightly integrated with the application. We utilize this tool to summarize meeting notes, extract action points, proofread, expand, and generally refine texts.

Privacy

Notion AI analyzes only the currently edited page. The tool doesn't persist or process the input text or the generated content for anything beyond performing the desired action. No text is retained by the tool.

Midjourney

Apart from those based on large language models, there are other machine learning-powered tools relevant to our business. The most pertinent ones are tools that generate visual assets. Although there are good open-source solutions available, led by Stable Diffusion, cloud services generally offer simpler interfaces, thus broadening their user base and providing impressive results without a need for any expert knowledge. Midjourney, the most promising tool from the cloud category, is currently undergoing active development, having recently made a significant leap forward with its version 5 release.

Highlights

Asset Generation: Midjourney is capable of providing stylized illustrations and near photorealistic images that can be integrated into our clients' products. A recent feature addition allows you to select one image as a template and ensure all other images generated maintain the same style. This greatly enhances the tool's usability for applications and web presentations, where visual consistency is crucial. Fun fact: The cover image of this document was entirely generated by Midjourney.

Ideation: The complex assets generated by the tool aren't always flawless. Significant artifacts can appear, with an extra finger on a hand being a mild example. Although these artifacts may prevent direct asset use, they don't inhibit users from assessing the asset's aesthetics. Given that generating a new image variant takes mere seconds compared to the hours a human would need, the tool proves invaluable for rapid prototyping and experimentation. Once the best variant is chosen, the user can recreate the asset manually to circumvent any glitches.

Quality & Security

Apart from occasional displeasing visuals, unrelated content, and inconsistent style, there are no other risks associated with using the generated content, as it consists simply of raster images. A review of the generated assets is obviously required before incorporating them into a product.

Privacy

By using Midjourney, you authorize Midjourney Inc. to process the submitted prompts and generated images in any way they see fit. Images generated on standard plans are automatically published (along with the input prompts) on their site, and anyone can access them. The Pro plan offers a so-called "Stealth mode", which keeps both the prompts and images inaccessible to other users, although this doesn’t prevent Midjourney Inc. from accessing them. It's important to understand that, unlike with GitHub Copilot or ChatGPT, the prompts for Midjourney don't contain a significant amount of sensitive information or know-how.

Stable Diffusion

As an open-source alternative to Midjourney, Stable Diffusion covers a broad set of use-cases. Unlike its cloud counterparts, it doesn't come with a default user interface (although there are some community-made ones) and therefore requires more technically skilled personnel to set it up. In exchange, however, you gain a vast array of settings and tweaks to optimize the output. The significant advantages of Stable Diffusion include:

  • Fine-tuning / custom training - enabling the model to concentrate on your domain and/or learn how to generate unknown elements
  • Modularity - community-provided modules considerably expand the capabilities of the base model
  • On-premise - both the inference and any additional fine-tuning occur on your device
  • Free - there's no license fee attached to the model, and no subscription is required

Highlights

Beyond simple prompt-to-image tasks, Stable Diffusion provides the user with greater control over how the resulting image is produced. Through ControlNet, an add-on that conditions generation on additional inputs, one can combine various features to achieve the final goal.

Feature Extraction and Expansion: A wide range of modules can extract specific features from an image and apply them to newly generated ones. Such features can denote style, pose of a person, or even a scribble abstraction of the entire image. The extracted features, combined with a text prompt, can generate entirely new images matching the desired properties.

Scribble Expansion: Ever wondered how your hand-drawn picture would look if it were a photo-realistic image? When properly instructed, Stable Diffusion can take your scribble and expand it into an image, while following the text prompt.

Editing: Unlike Midjourney, Stable Diffusion permits editing of existing images. For instance, you can select areas of an existing image where you want to generate different content, while keeping the rest of the image intact. Alternatively, you can change the aspect ratio of an image by extending its background.

Quality & Security

The quality of the generated image might often seem inferior to what Midjourney provides. Yet, when all the configuration options are used correctly by an experienced user, the model can yield impressive results. Just like with any other tool, a human user ultimately determines the quality.

Privacy

There are no privacy concerns associated with Stable Diffusion as the entire model can run on a local machine. Fine-tuning the model doesn't need to involve any cloud service either.

Summary

It won't be long before various AI tools become standard equipment for any developer who aspires to remain competitive in the market. At Salsita, we can't afford to ignore these tools; instead, we need to embrace them and strive to maximize their benefits sooner than our competitors.

As we are just at the beginning of this journey, we can't promise any specific increases in productivity to our clients. However, recent evaluations suggest that these benefits could be quite substantial. As time goes on and we acquire more experience with the tools, we will surely optimize how we use them to maximize the productivity boost.


Jiří Staniševský - Chief Technical Officer

Jiří finds joy in designing interesting architecture. Although his love for coding has evolved into a platonic relationship, he seizes any opportunity for hands-on engagement as a precious rarity.

