
anonymised people of the global south

@metamatar / metamatar.tumblr.com

call me antara. no longer in new delhi. immigrated to the they/them zone. tme. twenties.
brunelsblog
Besides making Nepal a captive market for its industrial goods, the other specificity of Indian expansionist exploitation and oppression is its control over Nepal's natural resources, mainly the latter's rich water resources. Most of the rivers which irrigate the most populated northern Gangetic plains flow through Nepal, and the cheapest and easiest source of energy required by India for future industrialisation and general consumption can be the huge water resources of Nepal, which has the second largest water resource potential in the world (out of an estimated potential of 83,000 megawatts of hydro-power, only 0.5 percent has been tapped so far). That is why the Indian expansionists have in the past been usurping Nepal's water resources, mainly for irrigation purposes, through the Sharada Dam Agreement in 1920, the Kosi Agreement in 1954 and the Gandaki Agreement in 1959. However, in 1996, through the so-called "Integrated Mahakali Development Project Agreement", they have taken full control of the whole of the Mahakali river for irrigation and power purposes. The earlier concluded Kosi and Gandaki Agreements were nakedly semi-colonial treaties, as they deprived the Terai, the grain bowl of Nepal, of irrigation by diverting all the irrigation water to India through dams constructed just at the Nepalese side of the border (allowing only a negligible amount of water to Nepal and prohibiting the building of other dams upstream for a considerable distance). The present Mahakali Treaty, however, has adopted a more fatal form of neo-colonial exploitation and oppression by talking of equality in theory but in practice ensuring a monopoly in the use of water and electricity to the Indian expansionists while imposing trillions of rupees of foreign debt upon Nepal. Besides this, through the "Joint Communique" of June 10, 1990, the Indian expansionists have opened the door for exercising monopoly over Nepal's most important water resources in future by declaring all the rivers of Nepal "common rivers" for India as well.
- Baburam Bhattarai, Politico-Economic Rationale of People's War in Nepal (1998)
closet-keys

it’s so wild that trump’s executive order has that bit about sex being assigned at conception.

I assume this is the anti-abortion position being encoded into the anti-trans legislation. no one is testing what gametes their embryo has—such a thing isn’t possible (embryos don’t produce them yet; there are primordial germ cells, but they aren’t differentiated into sperm or egg cells).

it just goes to show that gender/sex policing is never based on any science—there is no “scientific” definition of what “female sex” or “male sex” is. It doesn’t exist as a material reality. It is a legal construct, assigned with the first legal documentation a subject receives: the birth certificate.

and that isn’t based on chromosomes or gametes. it’s a decision based on whether a doctor thinks the infant will be most likely capable of penetrative heterosexual sex as “female” or “male.” because penetrative heterosexual sex is the point, the determination is based on whether the clitoris is long enough to penetrate (in which case “it’s a boy”), and if not, whether there is a vaginal canal deep enough for penetration (in which case “it’s a girl”), and if neither or both are true, the child’s body is forcibly surgically altered.

they can't define sex "biologically" because it is not a "biological" reality. it will be determined, as always, punitively, on a case-by-case basis using varying traits and legal documentation, to enforce patriarchal economic systems that depend on misogyny and transmisogyny.

one of the most telling parts from the executive order, to me, was this (emphasis mine):

"The erasure of sex in language and policy has a corrosive impact not just on women but on the validity of the entire American system. Basing Federal policy on truth is critical to scientific inquiry, public safety, morale, and trust in government itself."

this bit gives the game away with how important the institution and enforcement of sex assignment is to capitalism, the united states economy, the carceral/legal system, and the ideological buy-in to us-american hegemony.


it would be awesome if mushrooms were high protein and/or high calorie because they are so tasty but sadly they're basically just water... like i should be able to use them as my main protein source


genuinely, how many ppm of air freshener can be sprayed in a closed room should be regulated under accessibility regulations. im about to throw up rn and already have a headache.

ptactwo

it's hard work maintaining the levels of ignorance about music and cinema that i exhibit 👍🏽

i went to check this "angels" by this "robbie williams" bc with famous songs it's v common that i've heard them multiple times and even liked them, i just don't remember what they're called or who sang them. but this one i have literally never heard in my life


"don't make fun of people for not knowing something" ya but you understand there's a difference between actively making racist & dismissive assumptions about chinese people and like, not knowing exactly how chinese housing rentals work or whatever. right.

19th century ethnologists were expressing their horizon-expanding curiosity in an enlightening process of cultural exchange

ralfmaximus

To understand what's going on here, know these things:

  1. OpenAI is the company that makes ChatGPT
  2. A spider is a kind of bot that autonomously crawls the web and sucks up web pages
  3. robots.txt is a standard text file that most web sites use to tell spiders whether they have permission to crawl the site; basically a No Trespassing sign for robots (a minimal example follows this list)
  4. OpenAI's spider is ignoring robots.txt (very rude!)
  5. the web.sp.am site is a research honeypot created to trap ill-behaved spiders, consisting of billions of nonsense garbage pages that look like real content to a dumb robot
  6. OpenAI is training its newest ChatGPT model using this incredibly lame content, having consumed over 3 million pages and counting...
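
For the curious, a robots.txt is just a plain text file served from the site root. A minimal, hypothetical one that turns away OpenAI's crawler (which identifies itself as GPTBot) while allowing everyone else might look like:

    # served at https://example.com/robots.txt (hypothetical site)
    User-agent: GPTBot
    Disallow: /

    # every other bot may crawl everything
    User-agent: *
    Allow: /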

It's absurd and horrifying at the same time.

myconetted

This is neither absurd nor horrifying, and two statements here are false (4 and 6).

1.8m of the 3m page requests were to robots.txt. If they weren't respecting robots.txt, then the site admin would see many, many more overall requests, and a significantly lower fraction of them would be to robots.txt.
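
If you wanted to sanity-check that kind of claim yourself, here's a rough Python sketch against a hypothetical common-log-format access.log (the filename and the GPTBot user agent string are assumptions about your setup):

    # rough sketch: what fraction of a crawler's requests hit robots.txt?
    robots = total = 0
    with open("access.log") as f:  # hypothetical server access log
        for line in f:
            if "GPTBot" not in line:  # only count OpenAI's crawler
                continue
            total += 1
            if "/robots.txt" in line:
                robots += 1
    if total:
        print(f"{robots}/{total} requests ({robots / total:.0%}) were robots.txt fetches")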

The email explicitly says they have several billion websites, not just pages. In this case, it looks like they're all on different subdomains. There isn't a standard way to have robots.txt apply to all subdomains, which is why the person said robots.txt isn't a good solution.
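
robots.txt is scoped per host, so a well-behaved crawler has to fetch a separate one for every subdomain it touches. A sketch of that using Python's standard urllib.robotparser (the hostnames are made up):

    from urllib.robotparser import RobotFileParser

    # robots.txt is per host: each subdomain needs its own fetch;
    # no single wildcard file covers every subdomain at once
    for host in ["a.example.com", "b.example.com"]:  # hypothetical hosts
        rp = RobotFileParser()
        rp.set_url(f"https://{host}/robots.txt")
        try:
            rp.read()  # network fetch of that host's robots.txt
        except OSError:
            continue  # host unreachable; skip it
        print(host, rp.can_fetch("GPTBot", f"https://{host}/some/page"))

With billions of subdomains, that means billions of robots.txt fetches, which squares with 1.8m of the 3m observed requests being for robots.txt.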

Nobody doing LLM training just dumps all the data directly into the training pipeline. Data processing happens before that step, but first they need to collect data to even process; that's what the crawler is for. Part of the data processing step, after crawling, involves deduplication and quality filtering. It's misleading to say this is data they're training on; this is data going into the pool of stuff that might be trained on. That raw crawl has untold billions of pages, no doubt filled with higher- and lower-quality garbage that will get filtered out before training.
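
To make "deduplication and quality filtering" concrete, here's a toy sketch of the idea; this is not any lab's actual pipeline, just the shape of it:

    import hashlib

    def filter_crawl(pages):
        """Toy dedup + quality filter over raw crawled page text."""
        seen = set()
        for page in pages:
            digest = hashlib.sha256(page.encode()).hexdigest()
            if digest in seen:  # exact duplicate: drop it
                continue
            seen.add(digest)
            words = page.split()
            if len(words) < 50:  # too short to be real content
                continue
            if len(set(words)) / len(words) < 0.3:  # highly repetitive garbage
                continue
            yield page

Billions of machine-generated nonsense pages are exactly the kind of thing heuristics like these (and much smarter ones) exist to throw away.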

The solution they're looking for is just to contact someone to have them blacklist either the whole domain or the IP address, since it's apparently all running from the same IP. With large crawlers, shit like this happens all the time—they even mention the same thing happened with Amazon.

This honestly isn't very exciting.

mlembug

You can literally go to the relevant robots.txt file and examine it: all lines are commented out aside from the ones that tell *all* bots not to go to the /archive subsection of the site, which is presumably there to test whether bots are listening to instructions. Everything else is *allowed*.
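
Reconstructed from that description (an approximation, not a verbatim copy of the live file), the effective contents boil down to:

    # (all other lines commented out)
    User-agent: *
    Disallow: /archive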

OpenAI's web crawler downloads the robots.txt file, and handles it *correctly*.

The failure on OpenAI's side is not having limits on how deep to crawl.

TL;DR: OpenAI's spider is not ignoring the sign that tells it to not go there, it reads it, correctly understands that it can go there, goes there and enters a web so complicated it gets lost and is unable to leave.
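
The depth-limit fix is easy to picture: a breadth-first crawler with a budget that visits pages but refuses to expand links past a fixed depth. A minimal sketch, with the depth budget as an assumed parameter and link extraction stubbed out as a caller-supplied function:

    from collections import deque

    MAX_DEPTH = 5  # hypothetical budget; real crawlers tune this per site

    def crawl(seed_url, fetch_links):
        """BFS crawl that stops expanding at MAX_DEPTH, so an
        infinitely deep honeypot can't trap it forever."""
        seen = {seed_url}
        queue = deque([(seed_url, 0)])
        while queue:
            url, depth = queue.popleft()
            if depth >= MAX_DEPTH:
                continue  # visit, but don't follow links any deeper
            for link in fetch_links(url):  # caller fetches page, extracts links
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
        return seen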


I love tumblr users. they reblog posts that open with "I'm probably wrong about this" from the same blog that also posted 6000 words of sourced text.
