@SarahAsad23 (Contributor) commented Jun 28, 2025

This PR adds an option for case sensitivity to the keyword search operator. Users can now use a checkbox to specify whether their search should be case sensitive or case insensitive.

This is implemented by adding a CaseSensitiveAnalyzer, which extends the base Lucene Analyzer and is used for case-sensitive searches. The original StandardAnalyzer is still used for case-insensitive searches.
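
For readers unfamiliar with Lucene analyzers, the idea can be sketched roughly as follows. This is an illustrative sketch only and not necessarily the code in this PR: `CasePreservingAnalyzerSketch`, `AnalyzerChoice`, and `analyzerFor` are hypothetical names, and the sketch assumes that case sensitivity comes from tokenizing with `StandardTokenizer` while leaving out the `LowerCaseFilter` that `StandardAnalyzer` applies.

```scala
import org.apache.lucene.analysis.Analyzer
import org.apache.lucene.analysis.Analyzer.TokenStreamComponents
import org.apache.lucene.analysis.standard.{StandardAnalyzer, StandardTokenizer}

// Hypothetical sketch, not the class added in this PR.
// StandardTokenizer splits text on word boundaries. Because no LowerCaseFilter
// is chained after it, every token keeps its original casing.
class CasePreservingAnalyzerSketch extends Analyzer {
  override protected def createComponents(fieldName: String): TokenStreamComponents =
    new TokenStreamComponents(new StandardTokenizer())
}

object AnalyzerChoice {
  // Hypothetical helper: maps the checkbox value to an analyzer.
  // StandardAnalyzer lowercases tokens, so it gives case-insensitive matching.
  def analyzerFor(isCaseSensitive: Boolean): Analyzer =
    if (isCaseSensitive) new CasePreservingAnalyzerSketch()
    else new StandardAnalyzer()
}
```

Under these assumptions, the same analyzer would need to be used for both indexing the text and parsing the query, so that "Apple" and "apple" are either both lowercased or both left untouched.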

Examples:
caseSensitive
CaseInsensitive1
CaseInsensitive2

Dataset for Testing:
keyword_search_test_dataset.csv

@bobbai00 (Contributor) left a comment

Left some comments

import org.apache.lucene.analysis.Analyzer.TokenStreamComponents

class CaseSensitiveAnalyzer extends Analyzer {
override protected def createComponents(fieldName: String): TokenStreamComponents = {

Can you add some comments explaining the purpose of this class and how it is configured to be case sensitive?

In KeywordSearchOpDesc, the comment mentions the StandardAnalyzer's balanced performance and wide range of supported tokens. How does the CaseSensitiveAnalyzer compare to the StandardAnalyzer in those respects?
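
One way to check the token-coverage part of this question would be to print the tokens each analyzer emits for the same input. The helper below is only a sketch: `AnalyzerComparison` and `tokens` are hypothetical names, and it relies solely on Lucene's usual `TokenStream` consumption sequence (`reset`, `incrementToken`, `end`, `close`).

```scala
import org.apache.lucene.analysis.Analyzer
import org.apache.lucene.analysis.standard.StandardAnalyzer
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute
import scala.collection.mutable.ListBuffer

object AnalyzerComparison {
  // Hypothetical helper: collects the terms an analyzer produces for `text`.
  def tokens(analyzer: Analyzer, text: String): List[String] = {
    // The field name is arbitrary here; it only matters for per-field analyzers.
    val stream = analyzer.tokenStream("field", text)
    val term = stream.addAttribute(classOf[CharTermAttribute])
    val out = ListBuffer.empty[String]
    try {
      stream.reset()
      while (stream.incrementToken()) out += term.toString
      stream.end()
    } finally stream.close()
    out.toList
  }

  def main(args: Array[String]): Unit = {
    val standard = new StandardAnalyzer()
    try {
      // StandardAnalyzer lowercases, so this prints List(hello, world)
      println(tokens(standard, "Hello World"))
    } finally standard.close()
  }
}
```

Passing the CaseSensitiveAnalyzer from this PR to the same helper, with the test dataset's values as input, would show whether it splits text into the same tokens and differs only in casing.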

@SarahAsad23 changed the title from "Add case sensitivity to keyword search Operator" to "Feat(Operator): Add case sensitivity to keyword search Operator" on Jul 8, 2025
@SarahAsad23 changed the title from "Feat(Operator): Add case sensitivity to keyword search Operator" to "feat(operator): Add case sensitivity to keyword search Operator" on Jul 8, 2025
@github-actions bot added the backend label (Anything related to backend services) and removed feature fix labels on Oct 11, 2025