python-3.x parsing finance xbrl

How can I understand the semantic meaning of different values?


I want to get Apple's financial data. Download https://www.sec.gov/files/dera/data/financial-statement-and-notes-data-sets/2022_01_notes.zip from https://www.sec.gov/dera/data/financial-statement-and-notes-data-set.html, extract it, and put the contents in /tmp/2022_01_notes. The definitions of the sub and num tables and their fields are in https://www.sec.gov/files/aqfsn_1.pdf.

I computed the zip file's MD5 message digest:

md5sum  2022_01_notes.zip
b1cdf638200991e1bbe260489093bf67  2022_01_notes.zip

You can download it from the official web page or from my Dropbox:

https://www.dropbox.com/s/5ntwasipze8vr29/2022_01_notes.zip?dl=0

Wherever you download it from, please check the MD5 value. The SEC may have uploaded a wrong file and may replace the zip file in the future.
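For readers without md5sum, the same check can be sketched in Python with the standard hashlib module. The helper names md5_of_file and matches_expected are hypothetical, and the zip path in the usage note is an assumption about where the download was saved.

```python
import hashlib

# Digest published in the question for 2022_01_notes.zip
EXPECTED_MD5 = "b1cdf638200991e1bbe260489093bf67"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so a large archive is never fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected digest (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

Usage, assuming the archive was saved as /tmp/2022_01_notes.zip: call matches_expected('/tmp/2022_01_notes.zip') before extracting. A False result means the file differs from the one used in this question.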

import pandas as pd
df_sub = pd.read_csv('/tmp/2022_01_notes/sub.tsv', sep='\t')
df_sub = df_sub[df_sub['cik'] == 320193]  # Apple's CIK is 320193
df_sub
                      adsh     cik       name     sic countryba stprba     cityba  ...               instance nciks aciks pubfloatusd floatdate floataxis floatmems
4329  0000320193-22-000006  320193  APPLE INC  3571.0        US     CA  CUPERTINO  ...  aapl-20220127_htm.xml     1   NaN         NaN       NaN       NaN       NaN
4731  0000320193-22-000007  320193  APPLE INC  3571.0        US     CA  CUPERTINO  ...  aapl-20211225_htm.xml     1   NaN         NaN       NaN       NaN       NaN

0000320193-22-000007 is the accession number of Apple's filing for the quarter ended December 25, 2021 (instance aapl-20211225_htm.xml).

df_num = pd.read_csv('/tmp/2022_01_notes/num.tsv', sep='\t')
# all of Apple's numeric facts from this filing, keyed by XBRL concept (tag)
df_apple = df_num[df_num['adsh'] == '0000320193-22-000007']
# keep a single concept: RevenueFromContractWithCustomerExcludingAssessedTax,
# the XBRL taxonomy element that the accounting concept "revenue" is mapped to
df_apple_revenue = df_apple[df_apple['tag'] == 'RevenueFromContractWithCustomerExcludingAssessedTax']
df_apple_revenue_2021 = df_apple_revenue[df_apple_revenue['ddate'] == 20201231]
df_apple_revenue_2021

The dataframe is too wide to display in my terminal, so I wrote it to a CSV file

df_apple_revenue_2021.to_csv('/tmp/apple_revenue_2021.csv')

and opened it in Excel. The first two rows are pasted below.


For the first two rows, what do 8285000000 and 15761000000 mean? Please explain what each of the two values represents.

0000320193-22-000007    RevenueFromContractWithCustomerExcludingAssessedTax us-gaap/2021    20201231    1   USD 0xf159835fd3644f228d15724ad9d1837c  0   8285000000      0   1       0.013698995 5   -6
0000320193-22-000007    RevenueFromContractWithCustomerExcludingAssessedTax us-gaap/2021    20201231    1   USD 0x58c22680ab8dbbfb662ff4e14055c1bd  1   15761000000     0   1       0.013698995 5   -6

Solution

  • To explain these figures, you have to tie them back to the filing from which they were extracted. In this case, the filing with accession number 0000320193-22-000007 is the Form 10-Q for the fiscal quarter ended December 25, 2021. If you check that filing, you'll find, for example, seven of the values in your dataframe in the table "Net sales by reportable segment", specifically under Three Months Ended December 26, 2020.

    So, for example, 8285000000 is the net sales figure for the Japan segment for that period, while 15761000000 appears in the "Net sales by category" table, in the Services category, for the same reporting period. That table contains six more of the values in the dataframe.
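The same mapping can probably be recovered from the data set itself rather than by reading the filing. The sketch below assumes the notes data set ships a dim.tsv table with dimh and segments columns, as described in the field-definition PDF linked in the question. That file layout is an assumption here, and the small dim frame is a hypothetical stand-in whose segments strings are illustrative placeholders, not real dim.tsv content.

```python
import pandas as pd

# The two rows from the question, reduced to the columns needed here.
num = pd.DataFrame({
    "adsh": ["0000320193-22-000007"] * 2,
    "tag": ["RevenueFromContractWithCustomerExcludingAssessedTax"] * 2,
    "ddate": [20201231, 20201231],
    "dimh": [
        "0xf159835fd3644f228d15724ad9d1837c",
        "0x58c22680ab8dbbfb662ff4e14055c1bd",
    ],
    "value": [8285000000, 15761000000],
})

# HYPOTHETICAL stand-in for dim.tsv. With the real data one would instead
# load it, assuming the file exists with these columns:
#   dim = pd.read_csv('/tmp/2022_01_notes/dim.tsv', sep='\t')
# The segments strings below are placeholders based on the answer above.
dim = pd.DataFrame({
    "dimh": [
        "0xf159835fd3644f228d15724ad9d1837c",
        "0x58c22680ab8dbbfb662ff4e14055c1bd",
    ],
    "segments": ["(placeholder) Japan segment", "(placeholder) Services category"],
})

# A left join keeps every fact, even if its dimension hash has no match.
labelled = num.merge(dim, on="dimh", how="left")
print(labelled[["value", "segments"]])
```

Each value then carries the dimension it was reported under, which is what distinguishes rows that share the same tag, ddate and qtrs.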