Free-text fields are both a blessing and a curse.
They give users flexibility, but from a reporting point of view they’re often messy, inconsistent, and hard to work with. In this case, a customer had a Comments field where users manually typed information, but inside that text was a specific code they needed for reporting.
The requirement was simple on paper:
Extract a code from a free-text comments field.
In reality… not so simple
Let’s walk through how regular expressions in Qlik can solve this cleanly and reliably.
The Problem
The source field looked something like this:
"Breakdown reported – REF12345 vehicle unavailable"
"Job completed ABC 67890 no further action"
"Awaiting parts – no reference provided"
The rules for the code were consistent:
- 3 uppercase letters
- Optional space
- 5 digits
Examples of valid codes:
ABC12345ABC 12345
The challenge:
- The code could appear anywhere in the text
- Some rows didn’t contain a code at all
- We needed a clean output suitable for filtering and analysis
Why Regular Expressions?
You could try to solve this with string functions like Mid(), Left(), or Index(), but that approach quickly falls apart when:
- The position of the code varies
- The spacing isn’t consistent
- The text contains lots of noise
Regular expressions (regex) are designed for exactly this kind of pattern matching.
The Solution
This is the expression I used:
Coalesce(
Replace(
ExtractRegEx(
[Comments],
'[A-Z]{3}\s?\d{5}',
1
),
' ',
''
),
'NOT FOUND'
)
Let’s break it down.
Step 1: Extract the Code with ExtractRegEx
ExtractRegEx(
[Comments],
'[A-Z]{3}\s?\d{5}',
1
)
The regex pattern:
-
[A-Z]{3}→ exactly 3 uppercase letters -
\s?→ optional space -
\d{5}→ exactly 5 digits
This tells Qlik:
“Find me the first occurrence of three letters followed by five numbers, with or without a space.”
If no match is found, ExtractRegEx() returns NULL.
Step 2: Remove the Space (If It Exists)
Replace(…, ' ', '')
If the original text contained ABC 12345, this step converts it to:
ABC12345
This ensures the output is always consistent, which is crucial for joins, filters, and comparisons later on.
Step 3: Handle Missing Codes with Coalesce
Coalesce(…, 'NOT FOUND')
When no matching code exists, ExtractRegEx() returns NULL. Wrapping everything in Coalesce() lets us replace those NULLs with something meaningful:
NOT FOUND
This makes the result:
- Easier to understand for users
- Easier to filter in the app
- Safer for downstream logic
Final Result
| Comments Text | Extracted Code |
|---|---|
| Breakdown reported – REF12345 vehicle unavailable | REF12345 |
| Job completed ABC 67890 no further action | ABC67890 |
| Awaiting parts – no reference provided | NOT FOUND |
Clean, consistent, and analysis-ready
Why This Matters
Using regex in Qlik allows you to:
- Extract structured data from unstructured text
- Reduce manual data cleansing
- Build more reliable KPIs and dimensions
- Handle real-world, messy user input gracefully
If you work with comments, notes, descriptions, or reference fields, regex is one of the most powerful tools you can add to your Qlik toolbox.

Top comments (0)