python ibm-cloud data-science-experience

ValueError: Invalid endpoint: s3-api.xxxx.objectstorage.service.networklayer.com


I'm trying to access a CSV file in my Watson Data Platform catalog. I used the code generation feature in my DSX notebook: Insert to code > Insert StreamingBody object.

The generated code was:

import os
import types
import pandas as pd
import boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share your notebook.

os.environ['AWS_ACCESS_KEY_ID'] = '******'
os.environ['AWS_SECRET_ACCESS_KEY'] = '******'
endpoint = 's3-api.us-geo.objectstorage.softlayer.net'

bucket = 'catalog-test'

cos_12345 = boto3.resource('s3', endpoint_url=endpoint)
body = cos_12345.Object(bucket,'my.csv').get()['Body']

# add missing __iter__ method so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType(__iter__, body)

df_data_2 = pd.read_csv(body)
df_data_2.head()

When I try to run this code, I get:

/usr/local/src/conda3_runtime.v27/4.1.1/lib/python3.5/site-packages/botocore/endpoint.py in create_endpoint(self, service_model, region_name, endpoint_url, verify, response_parser_factory, timeout, max_pool_connections)
    270         if not is_valid_endpoint_url(endpoint_url):
    271 
--> 272             raise ValueError("Invalid endpoint: %s" % endpoint_url)
    273         return Endpoint(
    274             endpoint_url,

ValueError: Invalid endpoint: s3-api.us-geo.objectstorage.service.networklayer.com

Strangely, if I generate the code for SparkSession setup instead, the same endpoint is used but the Spark code runs fine.

How can I fix this issue?


I presume the same issue occurs with the other SoftLayer endpoints, so I'm listing them here to make this question applicable to those locations as well:

  • s3-api.us-geo.objectstorage.softlayer.net
  • s3-api.dal-us-geo.objectstorage.softlayer.net
  • s3-api.sjc-us-geo.objectstorage.softlayer.net
  • s3-api.wdc-us-geo.objectstorage.softlayer.net
  • s3.us-south.objectstorage.softlayer.net
  • s3.us-east.objectstorage.softlayer.net
  • s3.eu-geo.objectstorage.softlayer.net
  • s3.ams-eu-geo.objectstorage.softlayer.net
  • s3.fra-eu-geo.objectstorage.softlayer.net
  • s3.mil-eu-geo.objectstorage.softlayer.net
  • s3.eu-gb.objectstorage.softlayer.net

Solution

  • The solution was to prefix the endpoint with https://, changing it from

    endpoint = 's3-api.us-geo.objectstorage.softlayer.net'

    to

    endpoint = 'https://s3-api.us-geo.objectstorage.softlayer.net'
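
    A likely explanation, as far as I can tell, is that botocore parses endpoint_url as a URL and rejects it when no hostname can be extracted, which is what happens when the scheme is missing. The sketch below illustrates this with the standard library only. The helper names ensure_scheme and has_hostname are hypothetical, made up for this example, and are not part of boto3 or botocore.

```python
from urllib.parse import urlsplit


def ensure_scheme(endpoint, default_scheme="https"):
    """Return the endpoint with a URL scheme, adding default_scheme if it has none.

    Hypothetical helper for illustration; not part of boto3.
    """
    endpoint = endpoint.strip()
    if "://" in endpoint:
        return endpoint
    return "{}://{}".format(default_scheme, endpoint)


def has_hostname(endpoint):
    """Check whether URL parsing can extract a hostname from the endpoint."""
    return urlsplit(endpoint).hostname is not None


bare = 's3-api.us-geo.objectstorage.softlayer.net'
fixed = ensure_scheme(bare)

# Without a scheme the whole string is parsed as a path, so no hostname is found.
print(has_hostname(bare))   # False
# With the https:// prefix the hostname is recognised.
print(has_hostname(fixed))  # True
print(fixed)                # https://s3-api.us-geo.objectstorage.softlayer.net
```

    With a helper like this, the generated code could pass endpoint_url=ensure_scheme(endpoint) to boto3.resource('s3', ...), so the same fix applies to any of the endpoints listed in the question.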