Tags: python, beautifulsoup, python-requests, python-trio, httpx

Unable to validate my query for a particular website


I'm trying to query the following website, whose form takes an Access Number and a PIN.

The Access Number is a fixed value of 8778791867, and the PIN is dynamic.

From a normal browser I can check whether a PIN is valid or invalid.

But with my code below I can't get the same answer the browser gets: I keep getting invalid for every entry!

import httpx
import trio
import sys
import re
from bs4 import BeautifulSoup
from termcolor import colored
import os

# proxies = {
#     'https': 'http://127.0.0.1:8866'
# }


async def GetValues(client):
    r = await client.get(baseurl)
    soup = BeautifulSoup(r.text, 'lxml')
    return soup.select_one(
        '#__VIEWSTATE')['value'], soup.select_one('#__EVENTVALIDATION')['value']


async def GetToken(client):
    headers = {
        'Content-Type': 'application/x-protobuffer'
    }
    data = "\n\u0018qc5B-qjP0QEimFYUxcpWJy5B\u0012Ž\u000b03AGdBq264NJCBRJatpe-BlWUNra15cx9i1vtKgAz1cUbsPDMuuhHWpUB1CYdgH3nbhMONc8_uOtU1T3h5hIL0CplyaAt579wUhKd8UshQxTG9L-WRv1W1kNXu8iSF8MzlBxLjrySrHXux6eOm8HG9-oSmdonrvCgv4hkCK43XX_I3leXuhALavOAM06BR-3jYT1Kp0P0OiF8NMyswLsDS9jJUJ-2-290TrTwkuVCEB5YfuJxOvpemh_9iNkmGKRjCbzt5pPoTU_YOp0x2ycmYPNPpPsO9hL35FMiZys6y65q4ynZawQyNYIrAHp4cxAD8Pgj6YmS6Q_cGDTvLsKX60ZeV_W88uWjOC9rTcMHVvpBhbaAukM7p1T2hobQ94fYs1Dabk5LFdRkGUMg-QzWV5OSTXcc-zk7WBQPnF9e4uF2ii3gK1neLoQibb0453SwOp-2PWNbxbjA-nNvj8I-n5ePBV7dkM3P_31E7SOrycN04Wymy5rrV_nX74hLIDBoFph1BOE5aV8f4qu7wHquyPei4Vuq_QoTtLKhmwmSh4xP5-qbrB_N8azx8RTO9EJbMBmgh5FLgqhS4AG-M1YXrsjnSeYmJwXggl7wuvHfkjdzEKXKVdC1egfjC8T-HO-LWozjuDxFmNSioZYjV21JBsWW2DXN5HdF9jbgXuSEXAQNue7yUdPn7Ke6uWFYy8tI9jANel-qegV9fFl5m5l6c0Dl4pXHyBdjMgBPnjzm9yk2LBPyZHr9YxqgE7hDDR6yMBu_omo-NLvP6GYVT_M4Qel5_SM3mKTElkSSH20fgvfCJ6Es1Ggct9XaCplro_PoXz6xirr2sv2QKfmt--Aaoqh95iRYfW3mt8Vs9B2b7jMoB544lR0UQ8lQ8bBa5arB9ftPDsn6uWzMl5qV083r30ytI5vi1golZUbsQLcyUOcKndtSW7dLbhTnpTGM77tz_XPPUwQYkk2sKQiyL1uXSV91zWmM_FFqkv88Xv2zsWjNKP_Mc3Kkgz_vHmAf_yDkvwR2-_YtJaR4ucK6tia8YL5Zw09qDEkZDKv-wI6kKvciZybp-Fhfe7X9YwkTnwdmZYxOXfHRg9AxOSa3og7rsIhcjSLCdTRwepzbUWJVXnTgu9RqTKfAKY_Gh1RNdA4GvP2GkbB0CjMaEQeMfZKqpod-5PouvCy_nJnYDFbFr5L9cwYJq_9cDLZndpIEOz42nB0vZ-p2UYBnIXhPFx61KeGWII_K_QgZVgeYcmVRRE4raUxpeaQqxdiFmIs2W-V5jix0qL_GQC68T47xFQJsr6z1JquDFIZfP4JBRDl-pRorjm7D4CQCwTXS7TQylsMz7kO-Uic7T2oBJ5haPIuBr0ysBXgZ8Fa9SQ97COSqpXmh59RIfuB7tfllHBoF1H7i1XvV62AT7zo_b\u001a\t[55,0,34]\"ñ\u0019!t7GgsZjIAAU2PylnNkd85daFy3s8AV0EnyeEf2twTQcAAAA9VwAAAAZtAQecCYX5rWQiGoO-xoNngw6ze11i5zypQ8al8C0lt4qapS6clOXeVIoMUUx0UT4slNmi-FnRX_uAtXIxTiItTxglcnEZCPX9MV37eoFWZIVNq-GURUVVPAJroaTveC3aNq2UpnI4P8oqSMw3LBNqVu0EALll-Po7_wb_VE69dQ1YUUPSvPoiG6eW8VGra-rnBGo1m80y-ObOkl--dT4t7QAsBBPsLXP9r1vKRFWrCbtiYPAEsKzxZnqA-E-_CSc-CuD9Kb4AhRG1YM1dDV5e25FXMaql1pZh1IlvqHfT3Zf2ajIVXOz1EOo9bi9CZ41jOA9RuWG98LCkGzpiy5DKdfFcBUe6hDpYv-pVDMadkC3W8YhNucuOVyu0ikd5rEtDHSINiLjILm5rE33OlOcmVbJ0POcxKd_pInBQcfMdwCRV9qXYoOj8g6
0mPY4erlWibb_sPA80Ss3RiwaHjd5Ng8H-P5R5thyb9mHrOFgrgjAGqQEOvlSzeHNooJfPcgUTtot2D-Y-FZiEYCYbT-iEKCKqsrKw8sa6R-TyPnscG9r824IXl0QnpYe2kc4DveinzPL-HCMUGg2uYHpx3J0XsiUhFTmPhkcRVeC-kkoLLVUwRxnXdv0oFP6V4Aqh3D2Hn_lD4OlGYVlzf5pujBBdMaRjzEgEWVfxdcTg0thaoA2XCsh0N7rOI_ucqlg06KG_fxkyugmTuIDCZPUnFipOKX-c68w3TxpvznO8J92bIFQcNsmEOKc7yVJ9QqpMRKtoAFcQQxHHltQCYx8GKJp6jGTudwKnLiX_sevgZteL5IYK1eSjzl-Rc59pYpI6tQSDLDaKiZ5PpptYvvDzcbO3hEG242hn2gcKubHUiX3-nb3p4vPM2u1z9WCLiYw4y7wb-DEFg5ue48ZDb6LtpR9iVxACSgfDfL3Hrv9jH31PxNFJm44DmP6YLpcUpqOctrwPbbSLXxt2Q2iTxw_3MXVCG7dwa__7bdAo53hROcK9aCsd7-6cpM44TKde1lTJxH6Lo2v6lnuzIkhDYnHT60XgNhxp5R7Q_tXZeF1cj_nMZlj4yI0WESTHJ2vr8QerYLt7u9YKWjHFxH1tW6RJdWooO2GyjjsOrlKn6T5OW--OTm29fHWzRfGd_FMB2bK3G920jvI9YeZBranu8Qg9Z71XtOJMaNuVcewC9kEUo3HUD-vTW0kVOoe0g4ceu37Sb1H0cEhlcwa0a2wBH6nIbvVKnsrDilWqeImYKb-ttb7IVf2GGgcOEhTkUhU0cY1gutr5dMzMcfw3TXyIhPciBdkqB1Keo66wxXk0PViL8E_RDy0EnN-E4rKXALPOCWH8brk3wmp5M6Xdr6t2wZ2o9hG4GiXtaLn-wyRRKEJGSeV7yYz8pFH3ZOOPx-OxMDyRHtz_R_7PBtBN0iGStRA_zSuYJ4Zdm6DCV6ty2DB_jpzego6ITYiTHFvw3Eb9tsZQqTep6i7CqPrezo8-Zm0P9Y_9oqtawIx3hxgeW7CKZsly6yFaUQgdk4jbtFf8bExya90CqM9k2dHQ8ZqgxKPvITYLEXS2lPuqF24T7-B7zq7WOOfhKV6aBuZV0zn9DrEpj6D_uUzubR96Yw7bFB1-14nAchKO2WWQoMcoX_XJJ-TQWm5jc8Kx7TyfVx_5w7PIqO_gNUY59Uh5EueWwY7Ynl8ICpoBNgfYTo1YRm09Mvm3YUGeP_7Gp8Gnzozh_wb6zhYiS24SYCpzf1HC8mXeDucAkZKSBO5NGY3hc7pOOQD3s31a3fIjLNBKlUX_BoUWCIJDLcm02EgwAVUP9-12V8Kb2l4fg-ICOgiHMIezB928Ay4C0BfoOSZN8Hv_FGj9gZUmaDKPzPxuCDkCChjU5YGQur9KrRk2G0ttwkw0-BMpzkaHSdMWXGUPwDHvvSQwIGh7TqHEsesWhn-NenZ8ejz550hnjmR69zOP4JATrVlqafy4bteLVYiu0Fo0RcUHzuNmOetu4FI0VWcDlXXmDS3hG0jRy4NTQJ1FreroOVekqrEjzzWKrFkgMsi1SvQZGzVCsk-znkFxOHosbTScb8Kt4QbYmkhNFrqZXkDWx4V0CH05CAX4mIeVC-JY-JmeemUvNLSqwtsPQ-e2lWKGKPrMMYw1XOQ-N28FhcSGnOSiZdZfodaaOx4cs4iIq2TWday1UwUcd6k4YsRSkWohCQlYlf8SBPzKZgh__AWbeixFM4HBvUJVF7dN4Br3x83B3dNUGiCHNfzBq77EVYvcEZu8TOlxBli6ONE2p9W4AEnf1il54e1oe4M7jfMHXfgTKmgJsbm7bUlZ1c9V5FT8w6tapqw2TC1z4YblgM9Brt_VUuSI9nd0aMttX0Ow0Ma30FQTGzo8-LBeiK_Oe8dfyZIoK8CYdsG5hMcObuvd9WaNQ8vj49F_8tx_AyUz6SJCy_r1rV5oDV
jTZetmICz1mmk8vQzC6kf7Hwm0ZbmxoxGmj-VsviyYTS1dVgEvdnwpUf2m8j-zC3CQTmHHmDu12EJXO4qh5bq4_EFkIdJiUu_IdrEbwaLNC0GL1MqPcTySypXmoS-HSqNdncg0GJ4Zprs_yYsfbpHB9Oj0cv2epDjFPvwYOpwmWPC9Vb7dJcyvXCsmugeEEgmh7Rb6aS1IJ1DCs4nfHUYZ_Gx8_aBi-GL4Zyh3V2sk_GhpBHXjDCsZ7e5_NIJTHHKHjMsjq2RzOYwjVZPMNs36WyDWLbxUltL01ix7ye8XFUR-xmO2gk_JZ40po4O04-PDxLJv1SVsLOnCgyKKZBehzRTHgsDslXT2_eU0QIoVQ7nb3Tj44z0VUjdegWaoNJXp3OUwGY2MNoogzgbS9M0fn6Qop9vaEQKJGhjUn33rhIKOc9MpvDkiWwcAisxu9o-UH1OMs9Y-9EQF35TzhsMauE4G3L2so-yKdtKL-N8y728V-xXHBXnA2pzdj66bdsgubhcpPgqXGPKV05U6IX7z61VkltXoUnxtQqzXCjk1sK1EVIy3WlNisMB0TJgmZxusfIFhIRLfiks0og4lEDuTFOfqW1nBc4uja3Ud89kEYvygvDvQY6Iy4IDQYDKWiP7hP8r5XyckWdA3a3zqPg8Rx_XWSwRTPmeyV8qcgaH0YAab4pkCEYytcrZqF8GtWFa5_-deX1kjqRHbp__m-QwajI3N5N6ow4WprUojCZlDi3tIdYlN5hE19NlIyNMqybERrzG29eo7SFuRRa14Wjx5Y7zGE9Pd-gHpJ0dl2fjJ_3RtgStu*\n-7587942622\u0001qB\bRecharger(6Le8DfAUAAAAAKsECBdLJ0Z_I7TlcVufkB2QdCCi"
    r = await client.post(
        'https://www.google.com/recaptcha/api2/reload?k=6Le8DfAUAAAAAKsECBdLJ0Z_I7TlcVufkB2QdCCi', data=data.encode('latin-1'), headers=headers)
    # Raw string so that \d is not treated as an invalid string escape
    return re.search(r'"(\d.*?)"', r.text).group(1)


async def main(numbers, baseurl):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0"
    }
    # async with httpx.AsyncClient(timeout=None, headers=headers, proxies=proxies, verify=False) as client, trio.open_nursery() as nurse:
    async with httpx.AsyncClient(timeout=None, headers=headers) as client, trio.open_nursery() as nurse:
        async def check(num):
            values = await GetValues(client)
            # token = await GetToken(client)
            data = {
                "ctl00$masterScriptManager": "ctl00$attContent$ctl00|ctl00$attContent$btnSubmit",
                "__EVENTTARGET": "",
                "__EVENTARGUMENT": "",
                "__VIEWSTATE": values[0],
                "__VIEWSTATEGENERATOR": "C5057F25",
                "__EVENTVALIDATION": values[1],
                "ctl00$attContent$txtAccessNumber": "8778791867",
                "ctl00$attContent$txtPIN": num,
                "ctl00$attContent$hdnTier": "",
                # "ctl00$attContent$hdnTokenRecharge": token,
                "ctl00$EmailSignup$txtEmailSignup": "",
                "ctl00$hfUserId": "",
                "__ASYNCPOST": "true",
                "ctl00$attContent$btnSubmit": "Submit"
            }
            r = await client.post(baseurl, data=data)

            if 'Invalid' in r.text:
                print(f"Num: {num}, Status: {colored('Invalid','red')}")
            else:
                print(f"Num: {num}, Status: {colored('Valid','green')}")

        for num in numbers:
            nurse.start_soon(check, num)


if __name__ == "__main__":
    baseurl = 'https://www.virtualprepaidminutes.com/ATT_prepaid_calling_cards_refill_online.aspx'
    numbers = [5418531366, 5418531367]
    trio.run(main, numbers, baseurl)
    
    # if len(sys.argv) == 2:
    #     try:
    #         with open(sys.argv[1]) as f:
    #             numbers = f.read().splitlines()
    #         trio.run(main, numbers, baseurl)
    #     except FileNotFoundError as e:
    #         print(f"File {e.filename} is not exist!")
    # else:
    #     print(f"Usage: python {os.path.basename(__file__)} `InputFile`")

It should return invalid for 5418531366 and valid for 5418531367, but for some reason I'm getting invalid for both numbers.

The actual HTML response was attached as two screenshots.

Thanks in advance.

I would like to handle this without Selenium, as I've already built a Selenium script for this task.


Solution

  • The site you are trying to automate has integrated Google's reCAPTCHA v3 into its form:

     <script src="https://www.google.com/recaptcha/api.js?render=6Le8DfAUAAAAAKsECBdLJ0Z_I7TlcVufkB2QdCCi"></script>
    

    This is a known issue for web scrapers and automation scripts. You get invalid even for a valid submission because the server-side validation of the captcha fails and your script is detected as a bot.

    Google describes reCAPTCHA as:

    reCAPTCHA uses an advanced risk analysis engine and adaptive challenges to keep malicious software from engaging in abusive activities on your website. Meanwhile, legitimate users will be able to login, make purchases, view pages, or create accounts and fake users will be blocked.

    It seems you attempted to bypass it in the GetToken function, which replays a hard-coded payload captured from a single browser session, but that clearly failed. I suggest reading more about reCAPTCHA to understand why: see the official Google reCAPTCHA docs and the related Stack Overflow answer.

    No one here can give you a workaround. There are paid online services that offer captcha solving; I have never needed one myself, but you can search for them.
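
To confirm that a page embeds reCAPTCHA v3 before spending time on the form fields, you can look for the `api.js?render=<site key>` script tag quoted above. The following is a minimal sketch using only the standard library. The function name `find_recaptcha_v3_site_key` and the helper class are hypothetical names chosen for this illustration, and the check assumes the page loads reCAPTCHA through a plain `<script src=...>` tag rather than injecting it dynamically.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class _ScriptSrcCollector(HTMLParser):
    """Collects the src attribute of every <script> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_recaptcha_v3_site_key(html):
    """Return the site key from a reCAPTCHA v3 script tag, or None.

    Hypothetical helper: it only inspects static <script src=...> tags.
    """
    collector = _ScriptSrcCollector()
    collector.feed(html)
    for src in collector.sources:
        url = urlparse(src)
        if not url.path.endswith("/recaptcha/api.js"):
            continue
        render = parse_qs(url.query).get("render", [])
        # "explicit" and "onload" are loading modes, not site keys
        if render and render[0] not in ("explicit", "onload"):
            return render[0]
    return None
```

With the question's code, this could be called as `find_recaptcha_v3_site_key(r.text)` on the response from `client.get(baseurl)`. A non-`None` result suggests the form expects a freshly issued reCAPTCHA token, which a plain HTTP client does not produce.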