Replace the last half of every line in a file with the last half of the corresponding line in another file


I have two files, A and B. Every line in both files is an item. Every item has a fixed format: a key and a description, separated by a space, as shown in the example below.

UASPCH-XCF02-XXB1CF02-UACF02-ih_CW100M2_0000027_0000104 /users/documents/ark

The first part, UASPCH-XCF02-XXB1CF02-UACF02-ih_CW100M2_0000027_0000104, is the key, and the last part, /users/documents/ark, is the description. Files A and B have 1,000 and 100,000 items, respectively. Every key can be divided into two parts: an index (in our example, UASPCH-XCF02-XXB1CF02-UACF02-ih_CW100M2) and a time stamp (in our example, 0000027_0000104). There is no rule about the digits in the time stamp. The index and the time stamp are always separated by _. Every key is unique, and every index is also unique within a file. Every index in file A also occurs in file B, with a different time stamp, as shown in the simple example below.

File A

UASPCH-XCF02-SP062-XXB2CF02-UACF02-ih_CW100M2_0000000_0000119 /users/documents/ark1
UASPCH-XCF02-XXB1CF02-UACF02-ih_CW100M2_0000027_0000104 /users/documents/ark2

File B

UASPCH-XCF02-SP062-XXB2CF02-UACF02-ih_CW100M2_0000002_0000118 /users/documents/ark3
UASPCH-XCF02-XXB1CF02-UACF02-ih_CW100M2_0000026_0000107 /users/documents/ark4
UASPCH-XXM16-XXXB1M16-XUAM16-ih_CW100M3_0000039_0000129 /users/documents/ark5
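
The assumption that every index in file A also occurs in file B can be checked before replacing anything. The following is only a minimal sketch: the function name missing_indices is hypothetical, and it assumes that the time stamp is always the last two "_"-separated fields of the key, as in the examples above.

```shell
#!/bin/sh
# missing_indices A B: print every index of file A that has no counterpart in file B.
# Assumption: the time stamp is always the last two "_"-separated fields of the key.
missing_indices() {
    awk '
        { idx = $1; sub(/_[^_]*_[^_]*$/, "", idx) }  # key without its time stamp
        NR == FNR { seen[idx]; next }                # first file read (B): remember indices
        !(idx in seen) { print idx }                 # second file read (A): report missing
    ' "$2" "$1"
}

# Example call (prints nothing when every index of FileA is present in FileB):
# missing_indices FileA FileB
```

Note that file B is passed to awk first, so that its indices are already known when file A is read.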

For every index that occurs in both files, I want to replace the description in file B with the description that file A has for that index. The expected result for the example is shown below.

File B

UASPCH-XCF02-SP062-XXB2CF02-UACF02-ih_CW100M2_0000002_0000118 /users/documents/ark1
UASPCH-XCF02-XXB1CF02-UACF02-ih_CW100M2_0000026_0000107 /users/documents/ark2
UASPCH-XXM16-XXXB1M16-XUAM16-ih_CW100M3_0000039_0000129 /users/documents/ark5

How can I achieve this?


Solution

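    awk '
        # First file (FileA): map each index (the first two "_"-separated
        # parts of the key) to its description.
        NR==FNR{
            split($1,p,"_")
            a[p[1]"_"p[2]] = $NF
            next
        }
        # Second file (FileB): if the index is known from FileA, replace the description.
        split($1,b,"_") && ((b[1]"_"b[2]) in a){
            $NF = a[b[1]"_"b[2]]
        }1    # the trailing 1 prints every line of FileB, changed or not
    ' FileA FileB

Output

    UASPCH-XCF02-SP062-XXB2CF02-UACF02-ih_CW100M2_0000002_0000118 /users/documents/ark1
    UASPCH-XCF02-XXB1CF02-UACF02-ih_CW100M2_0000026_0000107 /users/documents/ark2
    UASPCH-XXM16-XXXB1M16-XUAM16-ih_CW100M3_0000039_0000129 /users/documents/ark5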
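
The command above prints the result to standard output and builds the index from the first two "_"-separated parts of the key, so it relies on the index containing exactly one underscore. If that cannot be guaranteed, one possible variant strips the time stamp from the end of the key instead and then writes the result back to file B. This is only a sketch under stated assumptions: the function name replace_descriptions is hypothetical, the time stamp is assumed to be the last two "_"-separated fields of the key (as in the examples), descriptions are assumed to contain no spaces, and mktemp is assumed to be available.

```shell
#!/bin/sh
# replace_descriptions A B: rewrite file B in place, taking the description
# from file A wherever the index matches.
# Assumptions: the time stamp is the last two "_"-separated fields of the key,
# and descriptions contain no spaces.
replace_descriptions() {
    tmp=$(mktemp) || return 1
    awk '
        { idx = $1; sub(/_[^_]*_[^_]*$/, "", idx) }  # key without its time stamp
        NR == FNR { desc[idx] = $2; next }           # file A: index -> description
        (idx in desc) { $2 = desc[idx] }             # file B: take the description from A
        { print }                                    # file B: print every line
    ' "$1" "$2" > "$tmp" && mv "$tmp" "$2" || { rm -f "$tmp"; return 1; }
}

# Example call:
# replace_descriptions FileA FileB
```

The output goes to a temporary file first because redirecting awk straight into FileB would truncate the file before awk has read it.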