Tags: linux, bash, terraform, cloud-init

How can I provide arguments to my script when it is run as user data in Terraform?


I made a script that sets up a LEMP stack:

#!/usr/bin/env bash

if tput colors >/dev/null 2>&1; then
  RED='\033[0;31m'
  YELLOW='\033[1;33m'
  CYAN='\033[1;36m'
  NC='\033[0m' # No Color
else
  RED=''
  CYAN=''
  YELLOW=''
  NC=''
fi

print_help() {
  echo -e "Usage: ${YELLOW}$0${NC} [options]"
  echo -e "${CYAN}Options:${NC}"
  echo "  --php_ver <version>        Specify PHP version (default is 8.2)"
  echo "  --nodb                     Do not install any database"
  echo "  --db_root_password <pass>  Set root password for the database"
  echo "  -h, --help                 Show this help message"
}

cleanup () {
  echo -e "${CYAN}Cleanup${NC}"
  apt-get -y autoremove && apt-get -y autoclean
  exit 0
}

if [ "$EUID" -ne 0 ]; then 
  echo -e "${RED}ERROR: Run this script as root or via using sudo.${NC}"
  echo
  print_help
  exit 1;
fi

export DEBIAN_FRONTEND=noninteractive

PHP_VERSION="8.2"
DB_TYPE="mariadb"


while [ "$1" != "" ]; do
  case $1 in
    "--php_ver")
      PHP_VERSION=$2
      shift 2
      ;;
    
    "--nodb") 
        DB_TYPE="none"
        shift
        ;;

    "--db_root_password")
      DB_ROOT_PASSWORD=$2
      shift 2
      ;;

    "-h" | "--help")
      print_help
      exit 0
      ;;

    *)
      echo -e " ${RED}Invalid option: ${YELLOW}$1${NC}"
      exit 1
      ;;
  esac
done

apt-get update && apt-get upgrade -y


if [ "$PHP_VERSION"  == "" ]; then
    echo -e "${RED}No php version provided defaulting into 8.2${NC}"
    PHP_VERSION="8.2"
fi

echo -e "${CYAN}PHP ${YELLOW}$PHP_VERSION${CYAN} will be installed ${NC}"

apt-get install -y nginx ca-certificates apt-transport-https software-properties-common
add-apt-repository -y ppa:ondrej/php 
apt-get update

apt-get install -y php${PHP_VERSION}-fpm  \
    php${PHP_VERSION}-mbstring \
    php${PHP_VERSION}-mysql \
    php${PHP_VERSION}-oauth \
    php${PHP_VERSION}-opcache \
    php${PHP_VERSION}-readline \
    php${PHP_VERSION}-xml

if [ "$DB_TYPE" == 'none' ];then
  echo -e  "${YELLOW}No Db support will be installed${NC}"
  cleanup
  exit 0;
fi

POOL_CONF="/etc/php/${PHP_VERSION}/fpm/pool.d/www.conf"
if [ -f "$POOL_CONF" ]; then
  echo -e "${CYAN}Configuring PHP-FPM to listen on ${YELLOW}127.0.0.1:9000${NC}"
  sed -i "s|^listen = .*|listen = 127.0.0.1:9000|" "$POOL_CONF"
  systemctl restart php${PHP_VERSION}-fpm
else
  echo -e "${RED}Failed to configure PHP-FPM: ${POOL_CONF} not found${NC}"
  exit 1
fi

echo -e "${CYAN}Configuring default Vhost${NC}"

rm -rf /var/www/html/*

echo "<?php phpinfo();" > /var/www/html/index.php
systemctl stop nginx

# Quote the heredoc delimiter so nginx variables like $uri below are not
# expanded by the shell.
cat >/etc/nginx/sites-available/default <<'EOL'
server {
    listen 80 default_server;
    listen [::]:80 default_server;

    root /var/www/html;

    index index.php index.html index.htm index.nginx-debian.html;

    server_name _;

    location / {
        try_files $uri $uri/ =404;
    }

    location ~ \.php$ {
        include snippets/fastcgi-php.conf;

        # With php-cgi (or other tcp sockets):
        fastcgi_pass 127.0.0.1:9000;
    }

    location ~ /\.ht {
        deny all;
    }
}
EOL

systemctl start nginx

echo -e "${CYAN}Installing ${YELLOW}${DB_TYPE}${NC}"

apt-get -y install mariadb-server mariadb-client


if [ "$DB_ROOT_PASSWORD" == "" ]; then
  echo -e "${YELLOW}DB Root password is missing. skipping${NC}"
  cleanup
  exit 0;
fi

echo "${CYAN}Provisioning Root User${NC}"

# Make sure that NOBODY can access the server without a password.
# (On MariaDB 10.4+ mysql.user is a system view, so set the password via
# ALTER USER rather than updating the table directly.)
mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED BY '${DB_ROOT_PASSWORD}'"
# Kill the anonymous users
mysql -e "DROP USER IF EXISTS ''@'localhost'"
# Because our hostname varies we'll use some Bash magic here.
mysql -e "DROP USER IF EXISTS ''@'$(hostname)'"
# Kill off the demo database
mysql -e "DROP DATABASE IF EXISTS test"
# Make our changes take effect
mysql -e "FLUSH PRIVILEGES"

cleanup

As you can see I need to provide the following arguments:

  • --php_ver
  • --nodb
  • --db_root_password

The script is located alongside the Terraform modules. So how can I execute it while also providing the necessary arguments to the script:

resource "aws_instance" "instance" {
    ami="ami-0d342235295932397"
    instance_type="t3a.micro"
    key_name = "ssh_key"
    iam_instance_profile = "myInstance"

    root_block_device {
        volume_size = 30
        volume_type = "gp3"
    }

    # rest of nessesary arguments
    userdata= # execute script here
}

I want to avoid modifying the script so I can use it outside Terraform as well.
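
For reference, outside Terraform the script is simply invoked directly with its options, e.g.:

sudo ./setup-lemp.sh --php_ver 8.2 --db_root_password 'secret'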


Solution

  • As far as Terraform and AWS are concerned, user_data is literally just a bunch of bytes that get saved in the EC2 API without any specific meaning. Software running inside your EC2 instance then retrieves that data using the Instance metadata and user data API and decides for itself how to interpret that data.
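
    You can inspect those bytes yourself from inside a running instance, since the metadata service returns the user_data content unchanged. A quick sketch using IMDSv2, the token-based flavor of the metadata API:

    TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
      -H "X-aws-ec2-metadata-token-ttl-seconds: 60")
    curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
      http://169.254.169.254/latest/user-data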

    For most general-purpose Linux machine images, the software handling user_data is cloud-init, and so the interpretation of your user_data content is as described in User data formats.

    You are currently using the User data script interpretation, which just writes the given data to disk as an executable file and tries to execute it. In that case cloud-init doesn't pass any arguments to the script, so your current script is not suitable for use in this way.

    The most flexible option is to set user_data to Cloud config data, which is a YAML format defined by cloud-init that allows describing various different actions cloud-init should take when it runs during system boot.


    cloud-init has two "modules" that are potentially useful for your requirement: Write Files to write arbitrary small files into the filesystem, and Bootcmd to run an inline script once during early boot.

    The following configuration generates a Cloud Config user_data that instructs cloud-init to first write the bootstrapping script to a specific location on disk, and then to run another small script that executes that file with specific arguments:

    resource "aws_instance" "instance" {
      # ...
    
      user_data = <<-EOT
        #cloud-config
        ${yamlencode({
          write_files = [
            {
              encoding    = "b64"
              content     = filebase64("${path.module}/setup-lemp.sh")
              owner       = "root:root"
              path        = "/usr/local/bin/setup-lemp"
              permissions = "0755"
            },
          ]
          bootcmd = [
            [
              "cloud-init-per", "once", "setup-lemp",
              "/usr/local/bin/setup-lemp",
              "--db_root_password", "foo",
            ],
          ]
        })}
      EOT
    }
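
    For reference, the user_data this produces is equivalent to hand-writing cloud-config YAML along these lines (yamlencode's exact quoting and layout will differ, and the base64 content is truncated here):

    #cloud-config
    write_files:
      - encoding: b64
        content: IyEvdXNyL2Jpbi9lbnYgYmFzaA...   # base64 of setup-lemp.sh (truncated)
        owner: root:root
        path: /usr/local/bin/setup-lemp
        permissions: "0755"
    bootcmd:
      - ["cloud-init-per", "once", "setup-lemp", "/usr/local/bin/setup-lemp", "--db_root_password", "foo"]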
    

    The bootcmd module is configured to run the following command line:

    cloud-init-per once setup-lemp /usr/local/bin/setup-lemp --db_root_password foo
    

    This uses a cloud-init helper tool called cloud-init-per, which arranges for the given command to be executed later in the boot process. bootcmd actually runs before write_files in a typical cloud-init configuration, so this extra helper allows deferring the actual execution of the script until later in the process after the file should already have been written.
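
    If you want to confirm after boot that the script really ran, and see its output, cloud-init's standard reporting is the first place to look. A sketch, assuming Ubuntu's default locations:

    cloud-init status --long                        # did the boot-time run finish cleanly?
    grep setup-lemp /var/log/cloud-init-output.log  # stdout/stderr captured from the script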


    Some considerations and caveats to keep in mind:

    • EC2 treats user_data as just a regular attribute of an EC2 instance, so it can be read by anyone with permission to query EC2 instance attributes. Therefore if you place your database root password there, the password may be visible to others working in your AWS account.

      An alternative that avoids this problem would be to instead write the password into a service like AWS Secrets Manager and use user_data to tell the instance how to retrieve the password, rather than specifying the password directly. The details of that are outside the scope of this question and answer, but a minimal sketch of the retrieval step follows after this list.

    • The order in which cloud-init runs its modules is configurable by the person who built your AMI. I've built the above assuming the ordering given in the docs for an Ubuntu system, which results in the order bootcmd, write_files, scripts_per_once and should therefore work. If you see strange behavior and suspect the steps are running in a different order then you might need to inspect your system's actual cloud-init configuration (typically /etc/cloud/cloud.cfg and the drop-ins under /etc/cloud/cloud.cfg.d/) to verify that it's running the modules in a suitable order.

    • Some of the work being done by your script -- package installation in particular -- could be dealt with using other declarative cloud-init modules rather than imperative bash scripting if you wish, which may make the result easier to debug because you can rely on cloud-init's status reporting commands. However, it would be hard to replace your entire script with cloud-init modules so in your case it may be simpler overall to let the entire problem (aside from writing the script in the first place) be dealt with in bash.
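
    As promised above, a minimal sketch of the Secrets Manager retrieval step (the secret name lemp/db-root is hypothetical, and the instance profile must grant secretsmanager:GetSecretValue):

    # Fetch the password at boot instead of embedding it in user_data:
    DB_ROOT_PASSWORD=$(aws secretsmanager get-secret-value \
      --secret-id lemp/db-root \
      --query SecretString --output text)
    /usr/local/bin/setup-lemp --db_root_password "$DB_ROOT_PASSWORD"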