Oracle EXADATA Optimization


I am working on an Oracle Exadata database server with 52 CPUs. When I query a table with 9 million records using a full table scan with a parallel hint of 128, the query takes more than 20 minutes to fetch the data, with no utilization at the DB level. What can be improved in this query?

The parallel_max_servers values are as follows:

Current_utilization: 432
Max Utilization: 1040
Initial Allocation: 1040
Limit Value: 32767

The query used is as follows:

ALTER SESSION FORCE PARALLEL QUERY;

SELECT /*+ PARALLEL(128) */ * FROM TABLE;

I have tried what is described in the problem statement above.


Solution

  • "with no utilization in the DB level" - this is the key piece of information in your question. This means that nearly all your time is being spent either on the network or on the client machine processing the incoming data (e.g. writing it to a file maybe or whatever else it does). This makes this not a database issue. There is no query tuning that can help, because the query itself is providing you data as fast as you can consume it. The issue is you can't consume it very fast, but that's not the database's fault. The fact that this is Exadata and has a bunch of CPUs doesn't help you in this situation - there's literally nothing for all that power to do. A serial direct path full table scan in non-Exadata with just one or two CPUs available would be just as fast. For this reason, don't try to ask for parallelism to solve this, it won't help. Parallelism and Exadata features kick in usefully when you have some query complexity - a filter predicate, a join, a sort, a limited column list, almost anything beyond select * from table.

    Here are some things to check.

    1. Ensure your fetch size is in the thousands at least. A fetch size of 1, 10, or 100 causes too many network round trips, and if your latency is in the tens of milliseconds or higher, that makes pulling millions of rows very inefficient. Increase the fetch size to something respectable, and check your network latency (use ping) to make sure it's reasonable. You can determine your current fetch size by finding your SQL in the shared pool (v$sql) and dividing rows_processed by fetches - see the sketch after this list.

    2. Your rows may be extraordinarily wide. If each row is thousands of bytes, 9 million rows take far more work to fetch over the network and write to disk than 9 million narrow rows of only a few dozen bytes each. Consider omitting overly large columns that you don't need (the sketch after this list also shows how to check the average row length).

    3. Your application may be processing the rows slowly. Is it sorting them? Looping over the rows and doing something expensive with each one? Tune the application.

    4. Your application is writing rows to a file slowly. The filesystem to which you are writing may be slow (NFS to some distant network, for example). Have your OS system admin or storage team take a look at this. If the writing process is I/O bound (not CPU bound), consider inline compression so fewer bytes are written to disk.
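
    The checks in items 1 and 2 can be approximated from the data dictionary. A minimal sketch, assuming your statement is still in the shared pool and your table is called MY_TABLE (adjust both filters to your case):

        -- Item 1: rows_processed / fetches approximates the rows returned per round
        -- trip, i.e. the effective client fetch (array) size.
        SELECT sql_id,
               rows_processed,
               fetches,
               ROUND(rows_processed / NULLIF(fetches, 0)) AS rows_per_fetch
        FROM   v$sql
        WHERE  sql_text LIKE 'SELECT /*+ PARALLEL(128) */%';

        -- Item 2: average row length in bytes (populated by optimizer statistics).
        SELECT table_name, num_rows, avg_row_len
        FROM   user_tables
        WHERE  table_name = 'MY_TABLE';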

    Lastly, once you've looked at all these options and still need the pull to go faster, you may need to establish multiple concurrent connections that each pull a portion of the table. For example, if you have 10 processes each running select * from table where MOD(pkcol,10) = :thread - 1 and binding its own thread number, each one pulls 1/10th of the table. That does mean 10x more work on the database side, but it overcomes limitations in the network and in application processing of the rows, and to some extent helps with writing to files by using multiple CPUs to do so. If this helps but your DBA complains about the load (it can cause excessive CPU usage on the Exadata storage cells, so it should not be overused), consider partitioning the table and having individual concurrent processes pull individual partitions instead. That reduces the load on the storage cells to an acceptable level.
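
    A minimal sketch of the modulo split, assuming a numeric primary key column PKCOL, 10 worker sessions each binding its own :thread value from 1 to 10, and hypothetical table and partition names:

        -- Run one copy per worker session; together the workers cover the whole table.
        SELECT *
        FROM   my_table
        WHERE  MOD(pkcol, 10) = :thread - 1;

        -- Partition-wise variant (if the table is partitioned): each worker reads a
        -- single partition, which keeps the extra load on the storage cells lower.
        SELECT * FROM my_table PARTITION (p2023_q1);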