SQL DB2: Concatenate TEXT from multiple records

 

Concatenation Road in Winter

 

Scenario

 

Here is a recipe on how to concatenate a field from multiple records into one field.

There are two pre-conditions for this method to work:

First.

Your source file needs to have an equivalent of a line number field.  This should be a numeric field indicating the sequence in which to concatenate the records.  This field doesn’t not need to be  consecutive.

Second.

The difference between lowest and highest line numbers should be a reasonably low number.  In this example the max difference is 11, and it means that at the very maximum we would need to concatenate 11 text fields together.

 

Step One

Generate a driver file with MIN line number, MAX line number and the Difference:

  
INSERT INTO F594023L21
SELECT DISTINCT A3AN8 AS TXAN8 , MIN( A3LIN ) AS MIN , MAX ( A3LIN
) AS MAX, COUNT(*) AS COUNT , MAX ( A3LIN ) - MIN ( A3LIN ) + 100
AS DIF FROM F11193 WHERE A3AN8 IN (SELECT ABAN8 FROM
F594023DL1) AND A3DS80 <> '' AND A3TYDT = 'QN' GROUP BY

 

The resulting file would look like this:

   TXAN8      TXMIN       TXMAX         TXCOUNT             TXDIF
     264        100         100               1               100
   51520        100         100               1               100
   92706        100         200               2               200
    5521        100         200               2               200
     552        100         200               2               200
  305667        100         200               2               200
 2305394        100         200               2               200
    5597        100         200               2               200
    2634      1,100       1,200               2               200
    2696        100         200               2               200
    6720        100       1,100              10             1,100
    8036        100         300               3               300

This file contains a key filed, MIN and MAX line numbers, record count and the difference between the MAX and MIN.

We will use this file to start concatenate from TXMIN line number to the highest difference number between MAX and MIN (TXDIF).  In this case we determined that the highest difference between MAX and MIN line numbers in this file is 11 (shown above as 1,100).

 

Step Two

Here we actually concatenate.  Wee  JOIN our file (file that contains TEXT field that we need to concatenate) to itself 11 times.  Each new JOIN is bringing a next records with next line number to concatenate.

INSERT INTO F594023L22
SELECT TXAN8, REPLACE( REPLACE(REPLACE(
COALESCE( TRIM( A1.A3DS80) , '') || ' ' ||
COALESCE( TRIM( A2.A3DS80) , '') || ' ' ||
COALESCE( TRIM( A3.A3DS80) , '') || ' ' ||
COALESCE( TRIM( A4.A3DS80) , '') || ' ' ||
COALESCE( TRIM( A5.A3DS80) , '') || ' ' ||
COALESCE( TRIM( A6.A3DS80) , '') || ' ' ||
COALESCE( TRIM( A7.A3DS80) , '') || ' ' ||
COALESCE( TRIM( A8.A3DS80) , '') || ' ' ||
COALESCE( TRIM( A9.A3DS80) , '') || ' ' ||
COALESCE( TRIM( AA.A3DS80) , '') || ' ' ||
COALESCE( TRIM( AB.A3DS80) , '')
, '|', ' '), X'0D', ' '), X'25', ' ') AS A3DS80
FROM F594023L21 A0
JOIN F11193 A1 ON A0.TXAN8 = A1.A3AN8 AND A1.A3LIN =
TXMIN        AND A1.A3TYDT = 'AA'
LEFT OUTER JOIN F11193 A2 ON A1.A3AN8 = A2.A3AN8 AND A2.A3LIN =
TXMIN +  100 AND A2.A3TYDT = 'AA'
LEFT OUTER JOIN F11193 A3 ON A1.A3AN8 = A3.A3AN8 AND A3.A3LIN =
TXMIN +  200 AND A3.A3TYDT = 'AA'
LEFT OUTER JOIN F11193 A4 ON A1.A3AN8 = A4.A3AN8 AND A4.A3LIN =
TXMIN +  300 AND A4.A3TYDT = 'AA'
LEFT OUTER JOIN F11193 A5 ON A1.A3AN8 = A5.A3AN8 AND A5.A3LIN =
TXMIN +  400 AND A5.A3TYDT = 'AA'
LEFT OUTER JOIN F11193 A6 ON A1.A3AN8 = A6.A3AN8 AND A6.A3LIN =
TXMIN +  500 AND A6.A3TYDT = 'AA'
LEFT OUTER JOIN F11193 A7 ON A1.A3AN8 = A7.A3AN8 AND A7.A3LIN =
TXMIN +  600 AND A7.A3TYDT = 'AA'
LEFT OUTER JOIN F11193 A8 ON A1.A3AN8 = A8.A3AN8 AND A8.A3LIN =
TXMIN +  700 AND A8.A3TYDT = 'AA'
LEFT OUTER JOIN F11193 A9 ON A1.A3AN8 = A9.A3AN8 AND A9.A3LIN =
TXMIN +  800 AND A9.A3TYDT = 'AA'
LEFT OUTER JOIN F11193 AA ON A1.A3AN8 = AA.A3AN8 AND AA.A3LIN =
TXMIN +  900 AND AA.A3TYDT = 'AA'
LEFT OUTER JOIN F11193 AB ON A1.A3AN8 = AB.A3AN8 AND AB.A3LIN =
TXMIN + 1000 AND AB.A3TYDT = 'AA'

 

 

This long SQL statement consists of two parts.

In the SELECT clause we concatenate the text field from all 11 JOIN files into one string.  We need to use function COALESCE because concatenate with NULL is always NULL.  Additional features here are
(1) separating each new string from next string with a space    –
. . . ) || ‘ ‘ ||
and
(2) removing special characters from the resulting sting  –
REPLACE( REPLACE(REPLACE( . . . , ‘|’, ‘ ‘), X’0D’, ‘ ‘), X’25’, ‘ ‘).

In the FROM clause we JOIN file to itself starting with TXMIN value and increasing it value  by one (or in this case by 100) to bring the next record to concatenate.

TXAN8     A3DS80                                  
     513  Yes! This is a very long TEXT field . . 
      89  Changed from RR to EM per F. Goldini 11-
    6308  Changed sales level to RF4  AG9100 
    5618  Supplier added to the FASL as ZZ to supp
      31  Per email dated 2012-07-27, from Jay F, Gas
     917  Added as SI 2011-11-09 MTV  . .         
     647  ADD AS EM FOR MAGNETICS/TRANSFORMERS FOR

The resulting output contains 2 fields: a key field and a very long text field.  If you to export this  into a file, the actual length of this field will be the length of the original TEXT field multiplied on the number of times you join the file to itself.  In this case, the resulting field length is 80 x 11 = 880.

This method is test twice on DB2.  It would be interesting to hear, if the same technique works on other platforms.

(Visited 723 times, 1 visits today)

Be the first to comment

Your question, correction or clarification Ваш вопрос, поправка или уточнение