I have a JSON file that contains duplicate entries, identified by the userid field. I want to remove those duplicates using the append() function and write the result to a new file in the same format as the original.
Here is the JSON file:
[
    {
        "userid": "7126521576",
        "status": "UserStatus.OFFLINE",
        "name": "Avril Pauling",
        "bot": false,
        "username": "None"
    },
    {
        "userid": "7126521576",
        "status": "UserStatus.OFFLINE",
        "name": "Avril Pauling",
        "bot": false,
        "username": "None"
    },
    {
        "userid": "6571627119",
        "status": "UserStatus.OFFLINE",
        "name": "Laverne Alferez",
        "bot": false,
        "username": "None"
    },
    {
        "userid": "1995422560",
        "status": "UserStatus.OFFLINE",
        "name": "098767800",
        "bot": false,
        "username": "None"
    }
]
The output file after removing duplicated userids should be:
[
    {
        "userid": "7126521576",
        "status": "UserStatus.OFFLINE",
        "name": "Avril Pauling",
        "bot": false,
        "username": "None"
    },
    {
        "userid": "6571627119",
        "status": "UserStatus.OFFLINE",
        "name": "Laverne Alferez",
        "bot": false,
        "username": "None"
    },
    {
        "userid": "1995422560",
        "status": "UserStatus.OFFLINE",
        "name": "098767800",
        "bot": false,
        "username": "None"
    }
]
I have tried the following code, but the append() function does not appear to work correctly; it only appends the last item:
import json
with open('target_user.json', 'r', encoding='utf-8') as f:
    jsons = json.load(f)
jsons2 = []
for item in jsons:
    if item['userid'] not in json2:
        jsons2.append(item)
with open('target_user2.json', 'w', encoding='utf-8') as nf:
    json.dump(jsons2, nf, indent=4)
Any quick help would be much appreciated.
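This should do what you need. The check in your loop tests a userid string against a list of whole records (and the list name is misspelled as json2), so it can never detect a duplicate. Keep the userids you have already seen in a set instead, and append a record only when its userid is new: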
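import json

# Load the original list of user records.
with open('target_user.json', 'r', encoding='utf-8') as f:
    jsons = json.load(f)

ids = set()    # userids seen so far
jsons2 = []    # records to keep, in their original order
for item in jsons:
    # Keep only the first record for each userid.
    if item['userid'] not in ids:
        ids.add(item['userid'])
        jsons2.append(item)

# Write the de-duplicated records in the same format as the input.
with open('target_user2.json', 'w', encoding='utf-8') as nf:
    json.dump(jsons2, nf, indent=4)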
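
If you need this in more than one place, the same idea can be wrapped in a small helper. The following is only a minimal sketch, assuming Python 3.7+ (where dicts keep insertion order); dedupe_by_key and the sample records are hypothetical names made up for illustration, not part of your code. It uses a dict keyed by userid rather than a set plus a list, so it does not call append() at all:

```python
import json


def dedupe_by_key(records, key):
    """Return records without duplicates of `key`, keeping the first occurrence.

    Hypothetical helper for illustration; relies on dicts preserving
    insertion order (Python 3.7+).
    """
    unique = {}
    for record in records:
        # setdefault stores the record only if this key value is new.
        unique.setdefault(record[key], record)
    return list(unique.values())


# Made-up sample data in the same shape as the file in the question.
sample = json.loads(
    '[{"userid": "1", "name": "a"},'
    ' {"userid": "1", "name": "a"},'
    ' {"userid": "2", "name": "b"}]'
)
deduped = dedupe_by_key(sample, "userid")
print(json.dumps(deduped, indent=4))
```

To use it on your file, you could pass the list returned by json.load() to the helper and hand the result to json.dump() as before.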