EDIT: I solved some classpath issues and now I get the following error in the log file that's generated when performing "SELECT *"
UP-04004: error reading file /home/oracle/tweet-dataloc/nosql.dat
KUP-04017: Operating system message: Error 0
KUP-04017: operating system message: /home/oracle/processor/nosql_stream: line 4: java: No such file or directory
Note that that file is EXACTLY in that path, and has 777 permissions. End of edit
I am making an external table pointing to an Oracle NoSQL database where the data is stored as key-value pairs.
NOTE that for this example, I am placing every directory in /tmp which all users at OS level can access, so we take away any OS permission related problems
First, I create the two virtual directories in SQL developer, grant permissions to my user (nosqluser) and of course, create the external table:
CREATE DIRECTORY ext_tab AS '/tmp/tweet-dataloc';
CREATE DIRECTORY nosql_bin_dir AS '/tmp/processor';
GRANT READ, WRITE ON DIRECTORY ext_tab TO nosqluser;
GRANT READ, EXECUTE ON DIRECTORY nosql_bin_dir TO nosqluser;
And then, I create the table as follows:
CREATE TABLE "NOSQLUSER"."TWEETS3"
("CREATED_AT" VARCHAR2(80 BYTE),
"ID_STR" VARCHAR2(80 BYTE),
"TEXT" VARCHAR2(200 BYTE),
"NAME" VARCHAR2(80 BYTE),
"LOCATION" VARCHAR2(80 BYTE),
"VERIFIED" VARCHAR2(80 BYTE),
"FOLLOWERS_COUNT" NUMBER,
"FRIENDS_COUNT" NUMBER,
"LISTED_COUNT" NUMBER,
"FAVOURITES_COUNT" NUMBER,
"STATUSES_COUNT" NUMBER,
"CREATED_AT_USER" VARCHAR2(80 BYTE),
"COUNTRY" VARCHAR2(80 BYTE),
"COUNTRY_CODE" VARCHAR2(80 BYTE),
"FULL_NAME_PLACE" VARCHAR2(80 BYTE),
"NAME_PLACE" VARCHAR2(80 BYTE),
"PLACE_TYPE" VARCHAR2(80 BYTE),
"IS_QUOTE_STATUS" VARCHAR2(80 BYTE),
"QUOTE_COUNT" NUMBER,
"REPLY_COUNT" NUMBER,
"FAVORITE_COUNT" NUMBER,
"RETWEET_COUNT" NUMBER,
"FAVORITED" VARCHAR2(80 BYTE),
"RETWEETED" VARCHAR2(80 BYTE),
"FILTER_LEVEL" VARCHAR2(80 BYTE),
"LANG" VARCHAR2(80 BYTE),
"TIMESTAMP_MS" VARCHAR2(80 BYTE)
)
ORGANIZATION EXTERNAL(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY "EXT_TAB2"
ACCESS PARAMETERS(
records delimited by newline
preprocessor nosql_bin_dir2:'nosql_stream'
fields terminated by '|'
missing field values are null
reject rows with all null fields
)
LOCATION ('nosql.dat')
)
REJECT LIMIT UNLIMITED ;
Finally, I show you how my nosql_stream script and nosql.dat files look like: /tmp/processor/nosql_stream:
#!/bin/bash
export PATH=$PATH:/usr/java/latest/bin
export CLASSPATH=/home/oracle/processor/*
java oracle.kv.exttab.Preproc $*
/tmp/tweet-dataloc/nosql.dat:
<config version="1">
<component name="publish" type="params" validate="true">
<property name="oracle.kv.exttab.connection.url" value="jdbc:oracle:thin:/@//relacional:1521/ORCLPDB1.localdomain" type="STRING"/>
<property name="oracle.kv.exttab.connection.user" value="nosqluser" type="STRING"/>
<property name="oracle.kv.exttab.tableName" value="nosqluser.tweets2" type="STRING"/>
</component>
<component name="nosql_stream" type="params" validate="true">
<property name="oracle.kv.exttab.externalTableFileNumber" value="0" type="INT"/>
<property name="oracle.kv.exttab.totalExternalTableFiles" value="1" type="INT"/>
<property name="oracle.kv.formatterClass" value="formatter.TweetFormatter" type="STRING"/>
<property name="oracle.kv.hosts" value="bequi_kvlite_1:5000" type="STRING"/>
<property name="oracle.kv.kvstore" value="kvstore" type="STRING"/>
</component>
</config>
This last file is generated through a Publish function performed on the NoSQL Database end. You can follow the process here.
Now, the problem is even though the records are processed like you can see here:
Fri May 10 08:16:34 +0000 2019|1126762942307811331|RT @annknownityy: Future doctor, lawyer, engineer, med tech, nurse, cpa, psychologist, diplomat, biologist, teacher, architect, in the offi?|Sycamore Girl?|Caloocan City, National Capita|false|85|190|0|7804|3131|Sat Mar 21 00:09:46 +0000 2015||||||false|0|0|0|0|false|false|low|en|1557476194346
Fri May 10 08:16:34 +0000 2019|1126762943347953664|University of Ibadan (UI) School Fees Schedule for 2018/2019 Academic Session ? http somelink ? Learn More|OlusegunFapohunda|Earth|false|592|5|3|104|6851|Thu Feb 11 21:49:57 +0000 2010||||||false|0|0|0|0|false|false|low|en|1557476194594
Fri May 10 08:16:34 +0000 2019|1126762943498948609|RT @zinadabo1: Pls we need help, I was just informed that Rotimi Akeredolu increased Ondo state university tuition from 35k to150k.
The data is not accessible. When I do a "SELECT * FROM nosqluser.tweets3" what I get is:
RA-29913: error executing call from ODCIEXTTABLEFETCH
ORA-29400: error in data cartdrige
KUP-04004: error reading the file /tmp/tweet-dataloc/nosql.dat
Since the registers are shown, I know that the NoSQL database is accessible, and the data is Formatted and read correctly. In fact, up to here the process should be complete, so why can't I access my data?
The nosql_stream file must look like:
#!/bin/bash
$JAVA_HOME/bin/java oracle.kv.exttab.Preproc $*
In other words, you have to specify to the nosql_stream where Java is, even when it is already specified in the environment variables or the .bashrc file.
This explains why it works on the container where the 12c is deployed since it is a "native" Linux system and makes us of these variables; meanwhile through the local SQL Developer these can't be accessed properly, and thus the file couldn't be properly executed.