python dataframe cluster-analysis linear-regression k-means

Encountering an error when trying to create an artificial dataframe in Python

This is my first post, so please pardon any mistakes on my end.

I was trying to create an artificial data frame to use with k-means clustering. Running the function that creates the data set and then viewing the data frame gives the error below.

TypeError: _append_dispatcher() missing 1 required positional argument: 'values'

I would appreciate any help resolving this.

from scipy.stats import norm 
import random
from numpy import *
import numpy as np
from ast import literal_eval
from pandas import DataFrame
def create_clustered_data(N,k):
    random.seed(10)
    points_per_cluster=float(N)/k
    x=[]
    
    for i in range(k):
        income_centroid=random.uniform(20000,200000)
        age_centroid=random.uniform(20,70)
        for j in range(int(points_per_cluster)):
            x=np.append([random.normal(income_centroid,10000),random.normal(age_centroid,2)])
        x=np.array(x)
    return(x)

df=create_clustered_data(100,5)
df

Error Message

TypeError                                 Traceback (most recent call last)
<ipython-input-204-0ff0b56b46c6> in <module>
     18     return(x)
     19 
---> 20 df=create_clustered_data(100,5)
     21 df
     22 

<ipython-input-204-0ff0b56b46c6> in create_clustered_data(N, k)
     14         age_centroid=random.uniform(20,70)
     15         for j in range(int(points_per_cluster)):
---> 16             x=np.append([random.normal(income_centroid,10000),random.normal(age_centroid,2)])
     17         x=np.array(x)
     18     return(x)

<__array_function__ internals> in append(*args, **kwargs)

TypeError: _append_dispatcher() missing 1 required positional argument: 'values'

Solution

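Here x=[] creates a list, not a NumPy array. Also check the syntax of the numpy.append function: it takes two required arguments, the array to append to and the values to append, i.e. np.append(arr, values). The call in the question passes only one argument, which is why the error reports that 'values' is missing. One way to solve the problem is to append each point to the list using list.append and then convert the list to a NumPy array.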

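import numpy as np

def create_clustered_data(N, k):
    # np.random is used explicitly here; in the question, "from numpy import *"
    # had replaced the standard-library random module with numpy.random.
    np.random.seed(10)  # fixed seed so the generated data is reproducible
    points_per_cluster = int(N / k)
    x = []
    for i in range(k):
        # Pick a random centre for this cluster
        income_centroid = np.random.uniform(20000, 200000)
        age_centroid = np.random.uniform(20, 70)
        for j in range(points_per_cluster):
            # Append one [income, age] point to the Python list
            x.append([np.random.normal(income_centroid, 10000),
                      np.random.normal(age_centroid, 2)])
    # Convert the list to a NumPy array once, after all points are collected
    return np.array(x)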

df=create_clustered_data(100,5)
df
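
For comparison, np.append can also be made to work if it is given both required arguments, the existing array and the values to add. The sketch below is only an illustration and not part of the original answer: the starting array of shape (0, 2) and the sample values are assumptions chosen for the demonstration.

```python
import numpy as np

# Start from an empty array with 0 rows and 2 columns (income, age)
x = np.empty((0, 2))

# Illustrative values only, not taken from the question
samples = [[52000.0, 34.0], [61000.0, 41.0], [48000.0, 29.0]]

for row in samples:
    # np.append(arr, values, axis): both 'arr' and 'values' are required.
    # With axis=0, 'values' must have the same number of dimensions as 'arr',
    # so each row is wrapped in an extra list to give it shape (1, 2).
    x = np.append(x, [row], axis=0)

print(x.shape)  # (3, 2)
```

Because np.append returns a new copy of the array on every call, the list-based version above is usually the better choice inside a loop.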