Saturday, April 12, 2014

NTILE vs WIDTH_BUCKET

Rick Cale recently asked a question on an article I wrote two years ago about Some Interesting Oracle Analytic Functions. His question (see comments on Apr 11, 2014) were about the results from the NTILE function and how the same value could be in two different buckets. It was an excellent question and one that got me digging a bit more into the functionality of NTILE.

They're two ways to handle "bucketing data" in Oracle. In the documentation Oracle describes the two types as either having equiwidth or or equiheight histograms.

Equiwidth (NTILE): Each bucket will have the same number of items in it with some buckets having at most 1 more item than other buckets. An easy way to think of this concept is to order all the items, then divide the data evenly into groups based on the number of buckets. For example, supposed you have 9 values (AAAABCDEF) and wanted to put them into 3 buckets. the buckets would be B1 = AAA, B2 = ABC, B3 = DEF. You'll notice that the value "A" is in both buckets B1 and B2.

When an uneven amount of objects need to go into the buckets, NTILE will fill the first bucket first, second bucket second, etc. For example, suppose you had 10 values (AAAABCDEFG) and wanted to fill them into 3 buckets. (Note this is similar to the previous example with an additional "G"). The buckets would be B1 = AAAA, B2 = BCD, B3 = EFG.

Equiheight (WIDTH_BUCKET): This will take the min and max range, divided by the number of buckets and place each value in it. For example, the salaries in the EMP table range from 801 ~ 5000. If you set the min/max range from 0~5000 3 buckets will be created. All values from 0~1,666 will go into B1, values from 1,6667~3333 into B2, and values 3334 to 5000 into B3. (Note: for simplicity I took out decimals in this split). 


An easy way to think of these two bucketing methods is that NTILE divides values based on the number of items. WIDTH_BUCKET divides values based on their values.

Here's an example that highlights the difference between the two functions. 
with data as (
  -- using this as data input
  select 3 as num_buckets
  from dual
)
select 
  d.dname, 
  e.ename, 
  e.sal,
  ntile (data.num_buckets) over (order by sal asc) ntile,
  width_bucket(sal, 0, max(sal+1) over (), data.num_buckets) width_bucket
from emp e, dept d, data
where e.deptno = d.deptno
order by sal;


DNAME          ENAME      SAL        NTILE      WIDTH_BUCKET
-------------- ---------- ---------- ---------- ------------
RESEARCH       SMITH         801          1        1
SALES          JAMES         950          1        1
RESEARCH       ADAMS        1100          1        1
SALES          WARD         1250          1        1
SALES          MARTIN       1250          1        1
ACCOUNTING     MILLER       1300          2        1
SALES          TURNER       1500          2        1
SALES          ALLEN        1600          2        1
ACCOUNTING     CLARK        2450          2        2
RESEARCH       JONES        2975          2        2
RESEARCH       SCOTT        3000          3        2
RESEARCH       FORD         3000          3        2
SALES          BLAKE        3850          3        3
ACCOUNTING     KING         5000          3        3
It's important to note that WIDTH_BUCKET is not an analytic function but NTILE is. For more information read the documentation for each function. For WIDTH_BUCKET, the documentation covers what happens with values outside the min/max range (they go into bucket 0 and num_buckets+1).

Monday, March 31, 2014

Running Custom Code for Tabular Forms (Part 1)

One of my biggest pet peeves with Tabular Forms in APEX is that it would only run basic (Insert, Update, Delete) DML functions against a table. This works really well for basic situations but more often than not data must be processed by a procedure to handle all the business logic. For this reason, I've avoided Tabular Forms for a very long time.

Last week at OGh APEX World, Dimitri Gielis showed how you can run your own procedure against a tabular form. Here's how to do it:

First, create a Tabular Form using the standard wizard. This will create the standard validations and processes for the page. Their will be two Page Processing processes as shown below.

These processes will automatically handle any of the data changes that a user makes in the Tabular Form. For now you can go ahead and remove each process as we'll use a custom procedure to process the data instead.

Next, create a new process. The important part comes when creating the process; be sure to select the Tabular Form option (as shown below).

In your PL/SQL block, you can now reference each of the columns using their column names (example :SAL and :ENAME). What's even better is that APEX will only run the code against rows that have changed which can save a lot of processing time. For example, if only two rows in a 15 row table were changed the code will be executed twice.

In the next article I'll show how to expand this functionality beyond a base table and use this new technique to modify any data set.

Sunday, March 16, 2014

oracleapex.com (part 2)

Back in December I blogged about the new domain: oracleapex.com. I used the domain to host shortcuts to common links for APEX. One example that I use on a regular basis is api.oracleapex.com which points to the latest API APEX documentation.

Initially I used the blog post to host the list of available shortcuts. This wasn't the best solution but it allowed me to quickly inform everyone of the URLs. I have since created an application (built using APEX of course) to list all the redirect shortcuts on oracleapex.com.

Check it out and let me know what you think. Of course if you'd like a specific URL please let me know either by leaving a comment or tweeting me at @martindsouza

Thursday, March 13, 2014

APEX 5.0 and KScope 14

A few weeks ago the Oracle APEX team released APEX 5.0 EA1 (Early Adopter). You can try the new version of APEX at ea.oracleapex.com.

I expect that the APEX team will release an EA2 and possible EA3 version before officially launching APEX 5.0 with the final release coming around June. Just to be clear, these are my personal estimates not any inside information.

Releasing APEX 5.0 in June would be perfect timing as it coincides with Kscope 14. The conference has a lot of talks specifically focused on APEX 5.0 from both the Oracle development team and some of the worlds leading experts on all things APEX. I think that this conference will be a great opportunity for people (both developers and managers) to learn about all the new features and get hands on experience with this latest version.

One of the unique aspects of Kscope is that you'll be able to easily talk to the Oracle APEX team, Oracle ACE and ACE Directors, and other experts in the community. These are the people that write the blogs, read the books, and do the webinars that you may have read/seen. You can sit down with them and get their advice on how to integrate 5.0 and leverage it to enhance your current applications and increase ROI.

I'll be giving a talk at Kscope 14 called "APEX 5.0 - New Features Highlights". It will cover some of the new aspects of APEX that will help both developers and the business. You can see all the Kscope 14 APEX abstracts here. If you're thinking of coming to the conference you should register soon as Early Bird rates end on March 25th.

Hope to see you at Kscope 14!

Wednesday, March 5, 2014

Referencing USER in APEX


It’s not uncommon to reference the current user as USER in your pl/sql code. A simple use case may be to determine the client or environment that you’re running in (ex: dev, test, prod).

Referencing USER will have some slight side effects when running the code in APEX as the current USER is actually APEX_PUBLIC_USER (or what ever user you configured). This can cause issues in your application. To resolve it, simply reference sys_context('userenv','current_schema’) instead.

Example:
-- Via SQL*Plus
select user, sys_context('userenv','current_schema')
from dual;

USER        SYS_CONTEXT('USERENV
---------- --------------------
GIFFY        GIFFY
If you run the same query in APEX the output is shown below. You'll notice that referencing USER it does not display my current schema (GIFFY in this case).
The same applies to compiled code executed from APEX. For example if you have a procedure that references USER and that procedure is run from APEX then USER will be APEX_PUBLIC_USER.

This can be really tough to detect in automated tests as when testing via SQL*Plus, USER will return the current schema name.