Search code examples
amazon-web-servicesamazon-dynamodb

Dynamo DB - Query GSI using composite keys: date range and a attribute


I am new to dynamodb so this might be a noob question. I have a table of financial movements of my users. I want to query by date range and a particular category so I can answer queries like "how much have I spent in [CATEGORY] this month?".

To achieve this, I created the following PK, SK and GSIs:

export class Movement implements MovementBaseAttributes {
  userPhoneNumber: string;
  messageId: string;
  amount: number;
  currency: string;
  description: string;
  date: string; // ISO 8601 format
  movementType: MovementType;
  category: string;
  createdAt: string;
  movementExpenseType?: MovementExpenseType;
  movementIncomeType?: MovementIncomeType;

  constructor(attr: Omit<MovementBaseAttributes, "createdAt">) {
    this.createdAt = getChileanDateISO();
    Object.assign(this, attr);
  }

  get PK() {
    return `MOVEMENT#${this.userPhoneNumber}`;
  }

  get SK() {
    return `MOVEMENT#${this.messageId}`;
  }

  get GS1PK() {
    return `USER#${this.userPhoneNumber}`;
  }

  get GS1SK() {
    return `DATE#${this.date}#TYPE#${this.movementType}`;
  }

  get GS2PK() {
    return `USER#${this.userPhoneNumber}`;
  }

  get GS2SK() {
    return `DATE#${this.date}#CATEGORY#${this.category}`;
  }

(... more code)

Then, I am using the following function to query the movements.

export async function getMovementsByQueryParams(
  user: UserSession,
  trace: LangfuseTraceClient,
  startDate: string,
  endDate: string,
  movementType?: string,
  category?: string
): Promise<Movement[]> {

  const startISO = formatDDMMYYYYToISO(startDate, "00:00");
  const endISO = formatDDMMYYYYToISO(endDate, "23:59");

  let keyConditionExpression =
    "GS1PK = :gs1pk AND GS1SK BETWEEN :startSk AND :endSk";

  let expressionAttributeValues: Record<string, any> = {
    ":gs1pk": `USER#${user.userPhoneNumber}`,
    ":startSk": `DATE#${startISO}`,
    ":endSk": `DATE#${endISO}`,
  };

  // Use GS1 index by default
  let indexName = "GS1";

  // If category is provided, use GS2 index
  if (category) {
    keyConditionExpression =
      "GS2PK = :gs2pk AND GS2SK BETWEEN :startSk AND :endSk";

    expressionAttributeValues = {
      ":gs2pk": `USER#${user.userPhoneNumber}`,
      ":startSk": `DATE#${startISO}#CATEGORY#${category}`,
      ":endSk": `DATE#${endISO}#CATEGORY#${category}`,
    };

    indexName = "GS2";
  } else if (movementType) {
    keyConditionExpression =
      "GS1PK = :gs1pk AND GS1SK BETWEEN :startSk AND :endSk";

    expressionAttributeValues = {
      ":gs1pk": `USER#${user.userPhoneNumber}`,
      ":startSk": `DATE#${startISO}#TYPE#${movementType}`,
      ":endSk": `DATE#${endISO}#TYPE#${movementType}~`,
    };
  }

  try {
    const result = await docClient.send(
      new QueryCommand({
        TableName: TABLE_MOVEMENTS_NAME,
        IndexName: indexName,
        KeyConditionExpression: keyConditionExpression,
        ExpressionAttributeValues: expressionAttributeValues,
        ScanIndexForward: false, // To get the latest movements first
      })
    );

    if (!result.Items) {
      throw new CustomError(OutputErrors.MOVEMENT_NOT_FOUND);
    }

    const movements: Movement[] = [];
    for (const item of result.Items ?? []) {
      movements.push(
        Movement.fromDynamoObject(item as MovementDynamoAttributes)
      );
    }

    if (!movements.length) {
      throw new CustomError(OutputErrors.QUERY_FOUND_NO_MOVEMENTS);
    }

    

    return movements;
  } catch (error) {
    console.error("Error fetching movements:", error);
    throw error;
  }
}

It works fine when the category is not passed. However, when I want to filter by date range and category, it returns all the movements.

Is there a way to achieve this? I do not want to retrieve all the movements and then filter by category in my function since it would be inefficient.

Here are the KeyConditionExpression

GS2PK = :gs2pk AND GS2SK BETWEEN :startSk AND :endSk

and ExpressionAttributeValues I have tried:

{
    ":endSk": "DATE#2024-07-11T23:59:00.000Z#CATEGORY#ALIMENTACION~",
    ":gs2pk": "USER#56963186163",
    ":startSk": "DATE#2024-07-01T00:00:00.000Z#CATEGORY#ALIMENTACION"
}

Solution

  • Strings are ordered lexicographically, and because you start with the date, you will get all of the categories back.

    You need to prepend the date with the type:

      get GS2SK() {
        return `CATEGORY#${this.category}#DATE#${this.date}`;
      }
    

    This will fulfil your use case as it conforms to the way strings are ordered.