[Bug] Stealth being detected by Chrome DevTools Protocol (CDP) #899

Open
brauliobo opened this issue Jul 2, 2024 · 32 comments
Labels
issue: bug report A bug has been reported needs triage

Comments

@brauliobo

Puppeteer stealth is now easily detected; check out https://deviceandbrowserinfo.com/learning_zone/articles/detecting-headless-chrome-puppeteer-2024

@vladtreny

It is not possible to fix this properly without modifying the Chromium source.

If anyone could start a custom Chromium project, that would be great.

@brauliobo

brauliobo commented Jul 11, 2024

It is possible to replace Puppeteer with a WebSocket connection through a browser extension (loaded with --load-extension) and then control the browser with chrome.scripting.executeScript calls. I've tested and verified this in a PoC.

I wonder, though, whether it would be enough for Puppeteer to talk to the browser over WebDriver instead of CDP.
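
A minimal sketch of what the extension side of such a setup could look like, assuming a Manifest V3 extension with the "scripting" permission and matching host permissions. The PoC itself is not shown in this thread, so the action names, the command shape ({ id, tabId, action, args }) and the WebSocket URL below are all hypothetical.

```javascript
// Hypothetical registry of page-side actions. chrome.scripting.executeScript
// injects a function (not a code string), so commands refer to actions by name.
const ACTIONS = {
  getTitle: () => document.title,
  clickSelector: (selector) => {
    const el = document.querySelector(selector);
    if (!el) return false;
    el.click(); // note: dispatches an untrusted (isTrusted === false) click
    return true;
  },
};

// Turn a command ({ id, tabId, action, args }) into executeScript() details.
function buildInjection(command) {
  if (!Object.prototype.hasOwnProperty.call(ACTIONS, command.action)) {
    throw new Error(`Unknown action: ${command.action}`);
  }
  return {
    target: { tabId: command.tabId },
    func: ACTIONS[command.action],
    args: command.args ?? [],
  };
}

// Run one command in the target tab and shape the reply for the controller.
async function handleCommand(chromeApi, command) {
  const [injection] = await chromeApi.scripting.executeScript(buildInjection(command));
  return { id: command.id, result: injection.result };
}

// In the service worker (assumed URL): connect(chrome, 'ws://127.0.0.1:8765')
function connect(chromeApi, url) {
  const socket = new WebSocket(url);
  socket.addEventListener('message', async (event) => {
    let command;
    try {
      command = JSON.parse(event.data);
      socket.send(JSON.stringify(await handleCommand(chromeApi, command)));
    } catch (err) {
      socket.send(JSON.stringify({ id: command?.id ?? null, error: String(err) }));
    }
  });
  return socket;
}
```

A named-action registry is used here because executeScript does not accept arbitrary code strings; whether element.click() is good enough for a given target site is a separate question, since the resulting events are not trusted.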

@vladtreny

Interesting idea. Before Puppeteer, I used extensions. Puppeteer is cool, but extensions provide additional useful APIs.

You can also call CDP commands from the extension's background script. That is not detectable.

@nwebson

nwebson commented Aug 7, 2024

I made a patch for this issue; it disables Runtime.enable, which causes this leak.
You can check it out here: https://github.com/rebrowser/rebrowser-patches
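
For readers wondering what this leak looks like from the page's side, here is a minimal sketch of the kind of probe the linked article describes. It assumes the reported behavior: once a CDP client has enabled the Runtime domain, Chrome serializes console arguments for that client, and doing so can run a getter the page planted on the logged object. The function name and the injectable consoleLike parameter are illustrative, and newer Chrome builds may no longer behave this way.

```javascript
// Returns true if logging the probe object caused its `stack` getter to run.
// `consoleLike` is injectable only so the logic can be exercised outside a browser.
function stackGetterTriggered(consoleLike = console) {
  let triggered = false;
  const probe = new Error('probe');
  Object.defineProperty(probe, 'stack', {
    configurable: true,
    get() {
      triggered = true; // something inspected the logged error
      return '';
    },
  });
  consoleLike.debug(probe);
  return triggered;
}
```

In a page, calling stackGetterTriggered() with the real console would be the actual check; a patch that avoids Runtime.enable aims to keep the getter from ever being read.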

@vladtreny

^ This way we lose a lot of Puppeteer functionality. We need a Chromium patch... if only someone could start a project.

@nwebson

nwebson commented Aug 8, 2024

> ^ This way we lose a lot of Puppeteer functionality. We need a Chromium patch... if only someone could start a project.

@vladtreny I see no loss of Puppeteer functionality after the patch. If you find any, please let me know via the issues section; I will be happy to address it.

@vladtreny

  • console.log does not work
  • click does not work
  • other CDP functions do not work

@nwebson

nwebson commented Aug 8, 2024

  • console.log works just fine, but you don't get Runtime.consoleAPICalled events
  • click and elements query work fine, too
  • which other CDP functions?

@vladtreny

Do you get the event here?
page.on('console', async (message) => { /* ... */ });

Also, can you show how you click?

@nwebson

nwebson commented Aug 8, 2024

page.on('console') relies on Runtime.consoleAPICalled, so it won't work; that's true.
page.click(selector, clickOptions) - this one works fine with my patch.

@vladtreny

How does it locate the element to click? Can I select it correctly inside 10 random iframes?

Does it click inside a closed shadow root?

@nwebson

nwebson commented Aug 8, 2024

@vladtreny it feels like I have to defend myself for something... I'm not trying to sell you anything.

You can try my solution, and if you find anything that doesn't work in the patched version, open a new issue for it. I will be glad to assist. Thanks.

@andrewmcwatters

It seems less than optimal that one has to patch a browser to prevent this non-standard stack field from leaking side-effects.

Has anyone else here also tried just not using Chrome? A preliminary test from our end showed that the only leaked behavior from Firefox was window.navigator.webdriver, and I'm not sure what the state of the art is to patch that, but IIRC, it's not as simple as setting it to false.

@andrewmcwatters

I should add that most detection strategies explicitly target Google Chrome, but there are far fewer explicit detection strategies for Firefox, it seems. Tor Browser is also built on Firefox for a reason.

@nwebson

nwebson commented Aug 8, 2024

@andrewmcwatters since this post yesterday - https://hacks.mozilla.org/2024/08/puppeteer-support-for-firefox/ - I guess detection strategies will evolve quite quickly to target Firefox, too.

@andrewmcwatters

andrewmcwatters commented Aug 8, 2024

Yeah, I want to diversify away from automating only Google Chrome with stealth measures, since that's a bit of a risk factor at this point.

I'm not sure what WebDriver BiDi's equivalent of Page.addScriptToEvaluateOnNewDocument is, though, or if it has one. It might be necessary to build an equivalent event from existing standard ones.
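
The WebDriver BiDi specification defines a script.addPreloadScript command, which looks like the closest counterpart: it registers a function that runs in new documents before page scripts. Below is a minimal sketch of the raw command message, assuming you speak to the BiDi WebSocket endpoint yourself. The helper name, the id counter and the example preload function are hypothetical.

```javascript
let nextCommandId = 0;

// Build a raw WebDriver BiDi command that registers a preload script.
// `functionDeclaration` is the source text of a function; `contexts` and
// `sandbox` are optional parameters of the same command.
function buildAddPreloadScript(fn, { contexts, sandbox } = {}) {
  const params = { functionDeclaration: fn.toString() };
  if (contexts) params.contexts = contexts;
  if (sandbox) params.sandbox = sandbox;
  return { id: ++nextCommandId, method: 'script.addPreloadScript', params };
}

// Example preload function (a placeholder; it only sets a marker property).
const command = buildAddPreloadScript(() => {
  window.__preloadRan = true;
});
// On an open BiDi session: socket.send(JSON.stringify(command));
```

Puppeteer and Playwright wrap this kind of call behind their own APIs, so a hand-built message like this mainly helps when checking what a given browser actually supports.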

@andrewmcwatters

andrewmcwatters@Andrews-iMac redacted % node --test
▶ tests
  ✔ https://arh.antoinevastel.com/bots/ (5887.661417ms)
  ✔ https://arh.antoinevastel.com/bots/areyouheadless (2896.569986ms)
  ✔ BotD (1423.917009ms)
  ✖ Fingerprint Pro Bot Detection (3997.748462ms)
    AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
      actual - expected
    
      'You are a bot'
    - 'You are not a bot'
               ^
        at TestContext.<anonymous> (redacted)
        at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
        at async Test.run (node:internal/test_runner/test:857:9)
        at async Suite.processPendingSubtests (node:internal/test_runner/test:565:7) {
      generatedMessage: true,
      code: 'ERR_ASSERTION',
      actual: 'You are a bot',
      expected: 'You are not a bot',
      operator: 'strictEqual'
    }

  ✖ BrowserScan (1719.802858ms)
    AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
      actual - expected
    
      'Robot'
    - 'Normal'
        at TestContext.<anonymous> (redacted)
        at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
        at async Test.run (node:internal/test_runner/test:857:9)
        at async Suite.processPendingSubtests (node:internal/test_runner/test:565:7) {
      generatedMessage: true,
      code: 'ERR_ASSERTION',
      actual: 'Robot',
      expected: 'Normal',
      operator: 'strictEqual'
    }

▶ tests (15927.932604ms)
ℹ tests 5
ℹ suites 1
ℹ pass 3
ℹ fail 2
ℹ cancelled 0
ℹ skipped 0
ℹ todo 0
ℹ duration_ms 16215.926107

@vladtreny

We need to patch Chromium; there's nothing else to do.
All these hacks are detectable.

Also, Cloudflare's new shadow-root-based protection is hard to bypass. Possible, but not ideal.

@ottodriver

It seems to me that this or some other leak has been affecting reCAPTCHA bypass since yesterday.
I'm getting a challenge 90% of the time on v2 and a low score on v3.
Can anyone confirm?

@andrewmcwatters

> We need to patch Chromium; there's nothing else to do. All these hacks are detectable.
>
> Also, Cloudflare's new shadow-root-based protection is hard to bypass. Possible, but not ideal.

Bummer.

@Vinyzu

Vinyzu commented Aug 13, 2024

> We need to patch Chromium; there's nothing else to do.
>
> All these hacks are detectable.
>
> Also, Cloudflare's new shadow-root-based protection is hard to bypass. Possible, but not ideal.

It's possible. See Selenium-Driverless. I'm also currently working on an open-source (Playwright-based) solution.

@vladtreny

> It's possible. See Selenium-Driverless. I'm also currently working on an open-source (Playwright-based) solution.

I know it... I just rechecked whether they added something new.
What value does it bring? Does it turn off Runtime? That way we lose a lot of features,
including the ability to bypass this new Cloudflare shadow root.

Cloudflare reads these threads, but anyway :)

@nwebson

nwebson commented Aug 13, 2024

> > It's possible. See Selenium-Driverless. I'm also currently working on an open-source (Playwright-based) solution.
>
> I know it... I just rechecked whether they added something new. What value does it bring? Does it turn off Runtime? That way we lose a lot of features, including the ability to bypass this new Cloudflare shadow root.
>
> Cloudflare reads these threads, but anyway :)

You keep talking about losing features, but you never provide any specific code that stops working when Runtime is off. If you could, that would be really useful for the community.

@vladtreny

For example, universally finding an element in runtime.
Clicking inside a closed shadow root.

@andrewmcwatters

> > > It's possible. See Selenium-Driverless. I'm also currently working on an open-source (Playwright-based) solution.
> >
> > I know it... I just rechecked whether they added something new. What value does it bring? Does it turn off Runtime? That way we lose a lot of features, including the ability to bypass this new Cloudflare shadow root.
> > Cloudflare reads these threads, but anyway :)
>
> You keep talking about losing features, but you never provide any specific code that stops working when Runtime is off. If you could, that would be really useful for the community.

It's also not desirable to use Selenium, but use a non-Selenium API.

@nwebson

nwebson commented Aug 13, 2024

@vladtreny do you have any example of code that breaks after disabling the Runtime.enable command?
@andrewmcwatters could you please clarify?

@Vinyzu

Vinyzu commented Aug 14, 2024

Many features break when Runtime is not enabled.
But as I said, I'm currently working on a Playwright solution, which fixes !every! issue.

@andrewmcwatters

andrewmcwatters commented Aug 14, 2024

> @vladtreny do you have any example of code that breaks after disabling the Runtime.enable command?
> @andrewmcwatters could you please clarify?

I'm talking specifically about Selenium-Driverless, not so much about your patches disabling the Runtime.enable command. The README.md of https://github.com/kaliiiiiiiiii/Selenium-Driverless reads:

> Note: This project is moving away from the selenium syntax

I'm not going to try and speak for everyone, but I think a lot of us are looking for drop-in solutions like yours.

My business uses Selenium, though; we don't use the other automation frameworks, in part because they're explicitly not designed for anything other than testing. You can use them for other purposes, but it leads to hacking around their APIs.

@NCLnclNCL

NCLnclNCL commented Sep 23, 2024

> Runtime.enable

You can wrap CDP yourself; don't use Runtime.enable and Runtime.consoleAPICalled, because they can be detected.
Puppeteer, Playwright, and Selenium all use Runtime.enable.

@nwebson

nwebson commented Sep 25, 2024

📣 I just published a post on how to access main-world objects from an isolated context; please read it here: How to Access Main Context Objects from Isolated Context in Puppeteer & Playwright

ℹ️ TLDR: use page.evaluateOnNewDocument to listen for window messages and eval the received code in the main world; then use page.evaluate to send a window message with your payload and receive the response from the main world.
It worked just fine for several clients of mine, please let me know (discord or here) if it doesn't work for you.
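
A rough sketch of the message-passing idea in the TLDR, not the code from the linked post. It assumes page.evaluateOnNewDocument injects into the main world and page.evaluate runs in an isolated context (as with the patched setup discussed above). The function names and the 'bridge-request' / 'bridge-response' channel names are made up for illustration, and values must survive postMessage cloning.

```javascript
// Main-world side: pass this to page.evaluateOnNewDocument(installBridge).
// `win` defaults to the page's window; it is a parameter only for testing.
function installBridge(win = window) {
  win.addEventListener('message', (event) => {
    const data = event.data;
    if (!data || data.channel !== 'bridge-request') return;
    let reply;
    try {
      // Indirect eval runs the code in the main world's global scope.
      reply = { ok: true, value: (0, eval)(data.code) };
    } catch (err) {
      reply = { ok: false, error: String(err) };
    }
    win.postMessage({ channel: 'bridge-response', id: data.id, ...reply }, '*');
  });
}

// Isolated-context side: pass this to page.evaluate(callMainWorld, code).
function callMainWorld(code, win = window) {
  return new Promise((resolve, reject) => {
    const id = Math.random().toString(36).slice(2);
    const onMessage = (event) => {
      const data = event.data;
      if (!data || data.channel !== 'bridge-response' || data.id !== id) return;
      win.removeEventListener('message', onMessage);
      if (data.ok) resolve(data.value);
      else reject(new Error(data.error));
    };
    win.addEventListener('message', onMessage);
    win.postMessage({ channel: 'bridge-request', id, code }, '*');
  });
}

// Intended use (not run here):
//   await page.evaluateOnNewDocument(installBridge);
//   const ua = await page.evaluate(callMainWorld, 'navigator.userAgent');
```

Two caveats with this sketch: page scripts can also see the message traffic and the listener, and a strict Content-Security-Policy may block eval, so it is a starting point rather than a stealth guarantee.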

@vladtreny

@nwebson

Can we add you to our Telegram group about anti-detect?
It includes anti-detect browser developers and other advanced bot writers.

@nwebson

nwebson commented Oct 9, 2024

@vladtreny sure, I'll be glad to join, my tg username is Nickwebson

7 participants