Monday, June 26, 2023

A Note from your Friendly Neighborhood LSP about ChatGPT


Thank you to UVA IT and the IT Security Community for sharing their experiences and recommendations for this article.

ChatGPT (a Generative AI tool) and similar large language model (LLM) applications are transforming the way we work. They have the potential to automate tasks, improve decision-making, and provide valuable insights.


The use of such tools also presents new challenges for information security and data protection. We want to be explicit: employees who choose to use GenAI tools at UVA should do so in a secure, responsible, and legal manner. In that vein, here are best practices to keep in mind:

  • Use only the minimum data needed for the query, and before submitting it, ask yourself: "Would I be comfortable sharing this information outside the University? Would it be okay for this information to appear on a public site such as a social network or blog?" Plan as if the data will be breached, or reused by the app for future training and responses (see the redaction sketch after this list).
  • Employees should refrain from uploading or sharing any data within GenAI tools that is confidential, proprietary, or protected by regulation or the UVA Data Policy (e.g., FERPA, HIPAA, or information deemed highly sensitive by the University):
    • Don’t copy and paste emails or documents in prompts to ChatGPT and similar AI tools that generate content.
    • Examples of highly sensitive data: Personal information such as driver’s license number, SSN, Student ID, Bank Account number, or Passport number.
    • Examples of FERPA records: Any ‘Education Record’ or information that is directly related to a student and maintained by UVA.
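
If you do experiment with a GenAI tool on low-risk text, a simple pre-flight scrub can catch obvious identifiers before a prompt ever leaves your machine. Below is a minimal sketch in Python; the patterns shown (SSN-style numbers, long digit runs, email addresses) are illustrative assumptions, not an approved UVA filter, and no pattern list can reliably catch every kind of sensitive data.

    import re

    # Hypothetical patterns for obvious identifiers -- illustrative only,
    # not an approved or exhaustive UVA filter.
    PATTERNS = {
        "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "long digit run": re.compile(r"\b\d{9,}\b"),
        "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    }

    def scrub_prompt(text):
        """Replace likely identifiers with placeholders and report what was found."""
        findings = []
        for label, pattern in PATTERNS.items():
            if pattern.search(text):
                findings.append(label)
                text = pattern.sub("[REDACTED " + label + "]", text)
        return text, findings

    prompt = "Student 123456789 (jdoe@example.edu) asked about SSN 123-45-6789."
    clean, findings = scrub_prompt(prompt)
    print(clean)     # identifiers replaced with placeholders
    print(findings)  # review this before deciding whether to send anything at all

A scrub like this is a seatbelt, not a substitute for judgment: if data is confidential or regulated, the answer is not to send it at all.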

After the jump, view additional considerations put together by InfoSec and UVA IT Directors. 


UVA Information Security: Questions to consider when using AI



  1. Does the system learn from the questions you ask? Are you leaking data in your question? Do not put any UVA data into these systems.
  2. Liability is unsettled. UVA does not yet have a contract with any GPT vendor. If you have signed up for an account, you have done so as a private individual and are solely responsible for what happens with the account.
  3. Remember the days of "garbage in, gospel out"? This applies to how these engines are trained: if there is bias in the training data, you will get biased output. However, do not make the mistake of thinking all data produced by an AI is gospel.
  4. Because these tools are so new, there are many legal issues that still need to be addressed. For example, there are open questions about the images these tools create. Is the tool creating a new image, or is it reusing existing images? If the latter, is the image copyrighted? Who is responsible for copyright infringement?
  5. Not all answers are accurate or appropriate; if you get an answer, you still need to verify it. Example: ChatGPT produced a Splunk query that did not work. When given that feedback, ChatGPT identified that the two commands it used do not work together and rewrote the query. It "knew" about the incompatibility when challenged, but did not use that knowledge in the code it originally produced.
  6. If you are having code produced, know what it is doing. If it imports a library, know the source of the library and what it does (see the first sketch after this list). Blindly putting code in place introduces unacceptable risk.
  7. Know what it does not do. In all the code that ChatGPT has produced in example queries for UVA IT, there has been no code that applies security controls (e.g., sanitizing input), which should be included in any code we write (see the second sketch after this list). Using this insecure code in publicly facing applications could cause significant damage when exploited.
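
On point 6 above, one lightweight habit is to look a suggested package up before installing it, rather than trusting the name the model produced. The sketch below is a hypothetical Python example that pulls a package's metadata from PyPI's public JSON API; the package name is a stand-in, and metadata alone does not prove a package is safe.

    import json
    from urllib.request import urlopen

    def pypi_metadata(package):
        """Fetch basic metadata for a package from PyPI's public JSON API."""
        with urlopen("https://pypi.org/pypi/%s/json" % package) as resp:
            return json.load(resp)

    # "requests" is just a well-known stand-in; substitute whatever the model suggested.
    info = pypi_metadata("requests")["info"]
    print(info["name"], info["version"])
    print(info["home_page"] or info["project_urls"])  # where does it come from?
    print(info["summary"])                            # what does it claim to do?

Checking the project page, maintainers, and release history will not catch everything (typosquatted packages can look plausible), but it beats installing whatever name appeared in a generated snippet.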
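On point 7, the classic omission is building a database query by pasting user input directly into a string. The sketch below contrasts that pattern with a parameterized query, using Python's built-in sqlite3 module; the table and data are hypothetical, and parameterization is one control among several (input validation, output encoding, least privilege), not a complete fix.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

    user_input = "x' OR '1'='1"  # a classic injection payload

    # UNSAFE: the pattern generated code often contains. The payload turns
    # the WHERE clause into a tautology and returns every row.
    rows = conn.execute(
        "SELECT * FROM users WHERE name = '%s'" % user_input
    ).fetchall()
    print(rows)  # [('alice', 'admin')] -- the filter was bypassed

    # SAFER: a parameterized query treats the input as data, not SQL.
    rows = conn.execute(
        "SELECT * FROM users WHERE name = ?", (user_input,)
    ).fetchall()
    print(rows)  # [] -- no user is literally named "x' OR '1'='1"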


What are the security concerns with using ChatGPT?


While ChatGPT and similar language models offer great potential for various applications, there are several security concerns that should be considered:

  • Privacy: When using ChatGPT, users need to provide inputs, which may include personal or sensitive information. There is a risk that this information could be stored or used inappropriately. It's important to ensure that proper data handling and privacy policies are in place to protect user data.
  • Data Bias: Language models like ChatGPT are trained on large datasets that contain biases present in the data. These biases can result in biased or discriminatory responses. It's crucial to mitigate these biases to ensure fairness and inclusivity in the system's outputs.
  • Malicious Use: ChatGPT can be exploited for malicious purposes, such as generating false information, phishing attacks, or convincing social engineering messages. It's necessary to have safeguards in place to detect and prevent such misuse.
  • Inappropriate Content: Without proper filtering mechanisms, ChatGPT may generate inappropriate, offensive, or harmful content. It's crucial to implement robust content moderation and filtering mechanisms to prevent the dissemination of such content.
  • Manipulation and Misinformation: ChatGPT generates responses based on the patterns it learned from training data, which makes it susceptible to manipulation. This opens up the possibility of using the model to spread misinformation or propaganda. Care should be taken to ensure that the system is not used to amplify false or misleading information.
  • Security Vulnerabilities: Any software system, including ChatGPT, may have security vulnerabilities that could be exploited by malicious actors. It's essential to regularly update and patch the system to address potential vulnerabilities and ensure its security.


Addressing these concerns requires a combination of technical solutions, robust policies, and responsible deployment practices to mitigate risks and maximize the benefits of using ChatGPT.

