
Oracle Pipelined functions

Pipelined functions are an extremely useful feature of Oracle 9i and above. They allow you to present disparate, non-table-based data as a standard table whilst streaming the output, so that processing can begin immediately.

I.e., they let you SELECT from a function much like you would SELECT from a table. Cool huh?
Why would you need something like this?

Let me go straight in with an example.
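
The function pipes rows of an object type and returns a collection of that type, so both need to exist before the function will compile. Mine are defined something along these lines (the attribute names here are illustrative; the sizes match the parsing code below):

create or replace type ea_bio as object (
  site      varchar2(20),
  sortcode  varchar2(10),
  abundance varchar2(5)
);
/

create or replace type ea_bio_table as table of ea_bio;
/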

create or replace function fn_ea_bio_file (p_path varchar2)
    return ea_bio_table
    pipelined
is
    l_clob      clob;
    l_n         pls_integer;
    l_line      varchar2(1000);
    l_site      varchar2(20);
    l_sortcode  varchar2(10);
    l_abundance varchar2(5);
begin
    -- fetch the file from the XML DB repository as a CLOB
    select XDBURIType(p_path).getClob()
      into l_clob
      from dual;

    -- make sure the CLOB ends with a newline so the last line is processed
    if substr(l_clob, -1, 1) != chr(10) then
        l_clob := l_clob || chr(10);
    end if;

    l_n := instr(l_clob, chr(10));
    l_line := substr(l_clob, 1, l_n) || ',';

    -- work through the CLOB one line at a time
    while l_n > 0 loop
        -- each line holds a 20-char site name followed by six 10-char fields
        for x in 0 .. 5 loop
            l_line := replace(l_line, chr(10), '');
            l_site := trim(substr(l_line, 1, 20));
            l_sortcode := substr(l_line, 21 + (10 * x), 9);
            l_abundance := trim(substr(l_sortcode, 9, 1));
            l_sortcode := trim(substr(replace(l_sortcode, ',', ''), 1, 8));
            if l_sortcode is not null then
                -- emit the row immediately; no buffering of the full result
                pipe row (ea_bio(l_site, l_sortcode, l_abundance));
            end if;
        end loop;

        -- move on to the next line
        l_clob := substr(l_clob, l_n + 1);
        l_n := instr(l_clob, chr(10));
        l_line := substr(l_clob, 1, l_n) || ',';
    end loop;

    -- a pipelined function ends with a bare return
    return;
end;

Here, I'm reading a CSV file from the local XML DB repository, but I could just as easily read a CSV from a local directory or a URL somewhere on the net.

I need to present the CSV data as a table so that I can query, group, sum, count and so on: basically, manipulate and select it as I like. Normally, I might read the entire CSV in and store it in a temporary table. That approach can get very messy: I'd need to clean up afterwards, it would take time to read and process large files, and I'd need to handle concurrent users, so I'd have to store a user_id and some concept of a run (sequence) number. I just want to select from a CSV file; is that so hard? Nope, it isn't. Read on.

In this code, I'm looping through the CSV file one line at a time.
Next, I'm setting the column values that I want to output.
Finally, I'm piping those columns out of the function.

I select from this function using:

select *
  from table(fn_ea_bio_file(p_source_folder || '/' || p_source_filename));

What could be easier? The results look like they come from a table or view, and I can point the query at any source file I like (no setting up directories in advance or any nonsense like that).

I can join this CSV file to other tables and basically treat it as I would any other internal data source.
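
For example, a grouped join against an ordinary table works exactly as you'd expect. The sites table and the file path here are hypothetical, and the column names are the ea_bio attributes sketched earlier:

select s.site_name,
       count(*) as records_found
  from table(fn_ea_bio_file('/public/ea/survey.csv')) b
  join sites s
    on s.site_code = b.site
 group by s.site_name;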

The function returns rows as it processes them. That's why it's called a pipelined function. It will spit out data whilst it is still consuming and processing the remaining data. It streams it as it gets it.
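
One nice consequence: a query that only fetches a few rows can finish long before the whole file would have been parsed, because Oracle stops calling the function once the consumer needs no more rows. For example (the path is again hypothetical):

select *
  from table(fn_ea_bio_file('/public/ea/survey.csv'))
 where rownum <= 10;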
