Tags: python, cloudera-cdh, cloudera-manager

Adding a parcel repository using Cloudera Manager Python API


I'm trying to install CDH5 parcels on a Hadoop cluster using the Cloudera Manager Python API. I'm doing this with the following code:

test_cluster = ... # configuring cluster
# adding hosts ...
for parcel in test_cluster.get_all_parcels():
    if parcel.product == 'CDH' and 'cdh5':
        parcel.start_download().wait()
        parcel.start_distribution().wait()
        success = parcel.activate().wait().success

But I get this error:

cm_api.api_client.ApiException: Parcel for CDH : 5.8.0-1.cdh5.8.0.p0.42 is not available on UBUNTU_TRUSTY. (error 400)

The CDH 5.8.0-1.cdh5.8.0.p0.42 parcel was in the AVAILABLE_REMOTELY stage, as we can see if we print the string representation of this parcel:

<ApiParcel>: CDH-5.8.0-1.cdh5.8.0.p0.42 (stage: AVAILABLE_REMOTELY) (state: None) (cluster: TestCluster)

After the code runs, the parcel changes its stage to DOWNLOADED.

It seems I should add a new parcel repository that is compatible with Ubuntu Trusty (14.04), but I don't know how to do this using the Cloudera Manager API.

How can I specify the new repository so that the correct CDH parcel is installed?


Solution

  • You may want to be more specific about the parcel you are acting on. In your condition, parcel.product == 'CDH' and 'cdh5', the literal 'cdh5' is always truthy, so every CDH parcel matches. I use something like the code below for the same purpose; the important part for your question is the combined check on parcel.version and parcel.product. After that (yes, my output is verbose) I print the list of parcels to verify that I am installing only the one parcel I want.

    I'm sure you've been here, but if not, the cm_api GitHub site has some helpful examples too.

    import sys
    from time import sleep

    from cm_api.endpoints.parcels import get_parcel

    # Assumes `api`, `cluster` and `cluster_name` are already defined.
    # This excerpt starts at distribution; the download step is not shown.
    cdh_version = "CDH5"
    cdh_version_number = "5.6.0"
    # CREATE THE LIST OF PARCELS TO BE INSTALLED (CDH)
    parcels_list = []
    for parcel in cluster.get_all_parcels():
        # Match on both product and version so only the wanted parcel is selected
        if parcel.version.startswith(cdh_version_number) and parcel.product == "CDH":
            parcels_list.append(parcel)

    for parcel in parcels_list:
        print("WILL INSTALL " + parcel.product + ' ' + parcel.version)

    # DISTRIBUTE THE PARCELS
    print("DISTRIBUTING PARCELS...")
    for p in parcels_list:
        cmd = p.start_distribution()
        if not cmd.success:
            print("PARCEL DISTRIBUTION FAILED")
            sys.exit(1)
    # MAKE SURE THE DISTRIBUTION FINISHES
    for p in parcels_list:
        while p.stage != "DISTRIBUTED":
            sleep(5)
            # Re-fetch the parcel to read its current stage
            p = get_parcel(api, p.product, p.version, cluster_name)
        print(p.product + ' ' + p.version + " DISTRIBUTED")

    # ACTIVATE THE PARCELS
    for p in parcels_list:
        cmd = p.activate()
        if not cmd.success:
            print("PARCEL ACTIVATION FAILED")
            sys.exit(1)
    # MAKE SURE THE ACTIVATION FINISHES
    for p in parcels_list:
        while p.stage != "ACTIVATED":
            sleep(5)  # pause between polls instead of busy-waiting
            p = get_parcel(api, p.product, p.version, cluster_name)
        print(p.product + ' ' + p.version + " ACTIVATED")
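
If a different parcel repository really does turn out to be necessary, one possible approach is to append its URL to Cloudera Manager's REMOTE_PARCEL_REPO_URLS setting. The sketch below is only an illustration under some assumptions: that `api` is a cm_api ApiResource, that `get_cloudera_manager()`, `get_config(view='full')` and `update_config()` behave as they do in the cm_api examples, and that the setting is stored as a comma-separated string. The helper names `merge_repo_urls` and `add_parcel_repo` and the URL in the usage comment are hypothetical.

```python
def merge_repo_urls(current_csv, new_url):
    """Return the comma-separated URL list with new_url appended at most once."""
    urls = [u.strip() for u in (current_csv or "").split(",") if u.strip()]
    if new_url not in urls:
        urls.append(new_url)
    return ",".join(urls)


def add_parcel_repo(api, new_url):
    """Append new_url to the REMOTE_PARCEL_REPO_URLS setting and return the new value.

    Assumption: `api` is a cm_api ApiResource (or any object exposing the
    same three calls used here).
    """
    cm = api.get_cloudera_manager()
    # view="full" returns config objects that carry both .value and .default
    repo_config = cm.get_config(view="full")["REMOTE_PARCEL_REPO_URLS"]
    current = repo_config.value or repo_config.default
    merged = merge_repo_urls(current, new_url)
    cm.update_config({"REMOTE_PARCEL_REPO_URLS": merged})
    return merged


# Usage against a real Cloudera Manager (hypothetical URL, not run here):
#   add_parcel_repo(api, "http://example.com/cdh5/parcels/5.8.0/")
```

Cloudera Manager may need a short while to refresh its parcel list after the setting changes, so it is probably worth waiting before calling cluster.get_all_parcels() again.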