Terms and Concepts

This page defines the common terminology and concepts used by HoundDog.ai. Familiarizing yourself with these terms will help you better understand the features and capabilities of our code scanner.

Data Elements

Sensitive Data Elements (or simply Data Elements) refer to any information within a codebase that is considered confidential, private, or critical. They often require special handling, such as masking or encryption, to prevent unauthorized exposure.

Categories

Below are some of the most common categories of sensitive data elements:

Category | Description
PII (Personally Identifiable Information) | Data that can be used to identify an individual, such as full names, physical addresses, email addresses, dates of birth, and Social Security Numbers.
PIFI (Personally Identifiable Financial Information) | A subset of PII focused on financial data, including credit card numbers, bank account details, and payment history.
PHI (Protected Health Information) | Medical records, insurance account information, or any data related to an individual's health status, as defined by regulations like HIPAA.

To ensure privacy and security, it’s essential to identify, classify, and safeguard sensitive data elements such as PII within your codebase.

Sensitivity Levels

Sensitive data elements are categorized into three sensitivity levels: Critical, Medium, and Low. Below are examples for each level.

Sensitivity Level | Examples
Critical | Social Security Numbers (SSNs); bank account details; medical patient history
Medium | Email addresses; usernames; IP addresses
Low | Dates of birth; gender; first and last names

Data Element Definitions

Sensitive Data Element Definitions (or simply Data Element Definitions) are predefined sets of conditions and match patterns used to detect identifiers in code, such as class names, function names, and variable names, whose naming strongly suggests that they handle sensitive data (e.g., User.lastName, get_ssn). HoundDog.ai continuously curates and refines this collection through advanced workflows and real-world testing.
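The idea of matching identifiers against patterns can be sketched in a few lines of Python. This is only an illustration: the pattern set, the `DEFINITIONS` name, and the `normalize` and `match_data_elements` helpers are hypothetical and do not reflect HoundDog.ai's actual definitions or matching logic.

```python
import re

# Hypothetical patterns for illustration only; not HoundDog.ai's real definitions.
DEFINITIONS = {
    "ssn": re.compile(r"(^|_)(ssn|social_security(_number)?)($|_)"),
    "last_name": re.compile(r"(^|_)(last_name|surname)($|_)"),
    "email": re.compile(r"(^|_)e?mail(_address)?($|_)"),
}


def normalize(identifier: str) -> str:
    """Convert camelCase and dotted names to snake_case."""
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", identifier)
    return snake.replace(".", "_").lower()


def match_data_elements(identifier: str) -> list[str]:
    """Return the names of all definitions whose pattern matches the identifier."""
    name = normalize(identifier)
    return [element for element, pattern in DEFINITIONS.items() if pattern.search(name)]


print(match_data_elements("User.lastName"))  # ['last_name']
print(match_data_elements("get_ssn"))        # ['ssn']
print(match_data_elements("retry_count"))    # []
```

Normalizing identifiers first lets one pattern cover several naming styles (lastName, last_name, User.lastName). Matching on word boundaries rather than raw substrings avoids flagging unrelated names that merely contain the same letters.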

Vulnerabilities

Vulnerabilities are flaws or weaknesses in software that can be exploited by attackers to gain unauthorized access to sensitive data elements, such as PII. They often stem from design flaws, coding errors, or configuration mistakes.

HoundDog.ai also classifies unauthorized or unintended dataflows as vulnerabilities, as they can pose significant privacy and security risks.

Dataflows

Dataflows refer to the movement of sensitive data elements such as PII, PIFI, or PHI through various parts of a codebase, particularly when passed into data sinks. Understanding and monitoring potentially vulnerable dataflows is crucial for identifying where sensitive information might be exposed, mishandled, or insufficiently protected.

Data Sinks

Data sinks are the endpoints or destinations that data reaches when it leaves its original context, such as logs, cookies, JWTs, external APIs, databases, third-party services, or user interfaces. Common examples include Datadog, AWS S3, PostgreSQL, Sentry, Firebase, and Snowflake. These are often the final stop for data before it is stored, transmitted, or displayed, making them critical points for enforcing security and privacy controls.
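As a rough illustration of a dataflow into a sink, the sketch below passes an email address to a logger, first unmasked and then masked. The function names and the masking rule are assumptions made for this example, and an in-memory stream stands in for a real log sink so the code is self-contained.

```python
import io
import logging


def mask_email(email: str) -> str:
    """Keep the first character and the domain, hiding the rest of the local part."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


# In-memory stream standing in for a real log sink (hypothetical setup).
sink = io.StringIO()
logger = logging.getLogger("signup_example")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(sink))
logger.propagate = False


def register_user_unsafe(email: str) -> None:
    # Vulnerable dataflow: the raw email address reaches the log sink.
    logger.info("registered user %s", email)


def register_user_safe(email: str) -> None:
    # The value is masked before it reaches the sink.
    logger.info("registered user %s", mask_email(email))


register_user_unsafe("jane@example.com")
register_user_safe("jane@example.com")
print(sink.getvalue())
```

The first call writes the full address to the sink, while the second writes only `j***@example.com`. Fixing the flow in code means the sensitive value never reaches the sink at all.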

While some data sinks offer PII scrubbing capabilities, they typically rely on sampling and broad pattern matching that lack the specificity or flexibility needed for different use cases. Scrubbing at the sink is also reactive and often costly, making it an inefficient last line of defense. HoundDog.ai takes a proactive approach by detecting risky dataflows early in the development cycle, long before any changes reach production.

CWEs

HoundDog.ai's mission is to shift left and empower organizations to prevent and eliminate vulnerabilities at the source code level, with a primary focus on PII (Personally Identifiable Information). Here are some of the most common CWEs (Common Weakness Enumerations) that our scanner covers extensively:

CWE | Description
CWE-201 | Information Exposure Through Sent Data
CWE-209 | Information Exposure Through an Error Message
CWE-312 | Cleartext Storage of Sensitive Information
CWE-313 | Cleartext Storage in a File or on Disk
CWE-315 | Cleartext Storage of Sensitive Information in a Cookie
CWE-532 | Insertion of Sensitive Information into Log File
CWE-539 | Use of Persistent Cookies Containing Sensitive Information
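To make one of these weaknesses concrete, here is a minimal sketch of the CWE-209 pattern, in which a sensitive value is embedded in an error message. The `PaymentError` class and both functions are hypothetical, and truncating to the last four digits is just one possible mitigation chosen for illustration.

```python
class PaymentError(Exception):
    """Hypothetical error type used only for this example."""


def charge_unsafe(card_number: str) -> None:
    # CWE-209 pattern: the full card number ends up in the error message,
    # which may later be shown to a user or forwarded to an error tracker.
    raise PaymentError(f"charge declined for card {card_number}")


def charge_safe(card_number: str) -> None:
    # Only the last four digits are included in the message.
    raise PaymentError(f"charge declined for card ending in {card_number[-4:]}")


for charge in (charge_unsafe, charge_safe):
    try:
        charge("4111111111111111")
    except PaymentError as err:
        print(err)
```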

Severity Levels

HoundDog.ai associates each vulnerability (i.e., vulnerable dataflow) with one or more data elements and classifies it into one of three severity levels: Critical, Medium, or Low. The severity level of a vulnerability is determined by the highest sensitivity level among its associated data elements.
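The "highest sensitivity wins" rule can be sketched as a simple maximum over ordered levels. The `Level` enum, the `SENSITIVITY` mapping, and its key names are hypothetical. The level assignments follow the examples in the Sensitivity Levels section.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    CRITICAL = 3


# Hypothetical element names; levels follow the Sensitivity Levels examples.
SENSITIVITY = {
    "ssn": Level.CRITICAL,
    "bank_account": Level.CRITICAL,
    "email": Level.MEDIUM,
    "ip_address": Level.MEDIUM,
    "date_of_birth": Level.LOW,
    "gender": Level.LOW,
}


def severity(data_elements: list[str]) -> Level:
    """Severity is the highest sensitivity level among the associated data elements."""
    return max(SENSITIVITY[element] for element in data_elements)


print(severity(["gender", "email"]).name)                # MEDIUM
print(severity(["date_of_birth", "ssn", "email"]).name)  # CRITICAL
```

A single Critical data element is enough to make the whole vulnerability Critical, regardless of how many lower-sensitivity elements accompany it. The sketch assumes at least one data element per vulnerability, as described above, so an empty list raises a `ValueError`.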
