Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opentelemetry support or hooks? #461

Open
JerrySievert opened this issue Aug 11, 2022 · 26 comments
Open

opentelemetry support or hooks? #461

JerrySievert opened this issue Aug 11, 2022 · 26 comments
Labels
enhancement New feature or request

Comments

@JerrySievert
Copy link

hi,

wondering if you've considered either adding opentelemetry support directly, or adding hooks for pre/post query execution?

just looking for a good way to get telemetry into your awesome library - thanks!

@porsager
Copy link
Owner

Hi @JerrySievert .. Thank you for the kind words..

I think it's a great idea to add some hooks so this can be achieved. I'll give some thoughts to how that could be achieved best.

@JerrySievert
Copy link
Author

fantastic! I look forward to seeing the addition, thanks for considering it.

@tomsanbear
Copy link

Hey @JerrySievert @porsager

I ended up taking a stab at instrumenting this library over here:
https://github.com/tomsanbear/opentelemetry-instrumentation-postgres

This is really just a result of a few hours trying to grasp this library's internals and finding a hacky way to get the query statement, along with start and end span markers. I'd love some recommendations @porsager if you have better thoughts on how to hook into this library. So far it works, but I wouldn't put it anywhere close to production without some more testing.

@JerrySievert
Copy link
Author

@tomsanbear that's awesome, but is there some trick to getting it to work?

const {
  PostgresInstrumentation
} = require('opentelemetry-instrumentation-postgres');

and then in my instrumentations when instantiating the sdk:

  instrumentations: [
    getNodeAutoInstrumentations(),
    new HapiInstrumentation(),
    new PostgresInstrumentation(),
    new NetInstrumentation()
  ]

is there something else I need to do?

@younanjo
Copy link

younanjo commented Mar 2, 2023

@porsager checking to see if there is any updates on supporting OpenTelemetry?

@porsager porsager added the enhancement New feature or request label Jun 25, 2023
@porsager
Copy link
Owner

I'm thinking the following to cover various use cases:

options.stats: true or sql.stats() gives you the following properties on the query result

result.duration // the complete duration of the query
result.waiting // the time the query waited to be sent
result.execution // the actual execution time (query sent to pg until response)

Now that's fine for hooking up or logging if required, but for the opentelemetry case, I'm thinking an option like this:

const sql = postgres({
  ...,
  onrequest() {
    // query queued
    return () => {
      // query sent
      return () => // query finished
    }
  }
})

I've got a branch with that setup here if you want to test it out https://github.com/porsager/postgres/tree/query-stats .. Let me know what you think:

@clouedoc
Copy link

What might be interesting would be to also provide hooks for different lifecycle events of the query;

i.e., for instance, there might be a high "time to connect to postgres" compared to "time of query execution"

@porsager
Copy link
Owner

porsager commented Jul 1, 2023

That's exactly what the above is? 😉

@clouedoc
Copy link

clouedoc commented Jul 1, 2023

Oh, haha, yes, sorry about that; I didn't pay enough attention 🤦‍♂️
Thanks for your library, by the way!

@gustavolzangelo
Copy link

I'm thinking the following to cover various use cases:

options.stats: true or sql.stats() gives you the following properties on the query result

result.duration // the complete duration of the query
result.waiting // the time the query waited to be sent
result.execution // the actual execution time (query sent to pg until response)

Now that's fine for hooking up or logging if required, but for the opentelemetry case, I'm thinking an option like this:

const sql = postgres({
  ...,
  onrequest() {
    // query queued
    return () => {
      // query sent
      return () => // query finished
    }
  }
})

I've got a branch with that setup here if you want to test it out https://github.com/porsager/postgres/tree/query-stats .. Let me know what you think:

@porsager this is perfect! I think these 3 stats are exactly what is needed. Is there any plan to implement this feature?

@hastebrot
Copy link

hastebrot commented Aug 14, 2023

My approach for manual instrumentation would be to wrap the sql function returned by postgres() in another function that first calls tracer.startSpan(), then awaits the original sql function, and then calls span.end() with optional error handling.

Then the trace exporter would receive startTimeUnixNano and endTimeUnixNano (as defined in otlp-transformer/src/trace/types.ts).

@mattsoftware
Copy link

I have tried to wrap the sql function but there is something I am missing. The code works great as long as the query does not use an embedded ${sql(...)} in the tmeplate.

Essentially I would like to wrap the function to instrument to AWS Xray, but if I do anything to resolve the promise sql returns then I loosethe ability to return the correct "thing" from my wrapped funciton, requring a rewrite of all the code which tries to call the function.

If anyone has an example wrapper function to be able to instrument the query and time taken that would be amazing!

@porsager
Copy link
Owner

porsager commented Feb 14, 2024

Did you try the https://github.com/porsager/postgres/tree/query-stats branch mentioned right above instead of trying to create a wrapper @mattsoftware ?

@mattsoftware
Copy link

I have attempted to use it to no avail. I added this in my package.json dependencies...
"postgres": "https://github.com/porsager/postgres.git#query-stats"

and then updated my sql options to this...

        const sql = postgres({
            host,
            port: databaseSecrets.port,
            database: databaseSecrets.dbname,
            username: databaseSecrets.username,
            password: databaseSecrets.password,
            ssl,
            idle_timeout: 3,
            max_lifetime: 60 * 15,
            onrequest: () => {
                console.log(
                    "======================= SQL REQUEST queued ======================="
                );
                return () => {
                    console.log(
                        "======================= SQL REQUEST sent ======================="
                    );
                    return () => {
                        console.log(
                            "======================= SQL REQUEST finished ======================="
                        );
                    };
                };
            }
        });

expecting to see those console.logs happen during and after the sql call. No errors, but nothing logged either. If you can hint as to what I am missing that would be fantastic as I think this would be perfect for me to instrument my sql calls!

Thanks.

@neerfri
Copy link

neerfri commented Feb 19, 2024

Hi @porsager
First thanks for the work on this library, it's awesome!

Do you have plans to merge the query-stats branch?
We would love to also try to instrument our DB calls with it.

One of the nice things about the postgres package is that it has 0 dependencies 👏 but if you don't mind adding a dependency on opentelemetry/api package (which has 0 dependencies too) I can create a PR that adds instrumentation directly to the library exporting not only tracing but also exception events and prometheus-style metrics (like query duration histograms, table read/write counters, connection stats, etc...).
opentelemetry is the de-facto standard in instrumentation and is vendor neutral. the opentelemetry/api package allows to instrument code that only gets activated when the hosting application registers concrete telemetry collectors by using the opentelemetry/sdk package and is a no-op otherwise.

I believe this can bring a lot of value to users of the library since they will get meaningful instrumentation by default.
Telemetry is often a component of an application that feature-developers are less involved with as they are building the system and in many cases it is added without changing anything in the application's code, see for example the auto-instrumentation of node.js applications.
By supporting opentelemetry in the library we can allow these teams to receive great instrumentation out of the box.

Below is an image of one of our traces, I'm attaching it here to give some visual context that will explain how valuable this is. In the image is a request traveling through our different services and modules, as of now we cannot see the DB requests since we had no way to instrument postgres but we imagine how insightful it would be if we cloud 😄

image

@tobiasmuehl
Copy link

@neerfri What a generous offer. I hope it will be accepted by the library author. In the meantime, are you willing to maintain a fork of this library to get tracing implemented? I would most certainly use the fork! 🙇‍♂️

@neerfri
Copy link

neerfri commented Feb 20, 2024

@tobiasmuehl Thanks for your kind words and support ❤️
Let's wait a few days for @porsager to respond to see if the general direction he wishes to rake is to integrate opentelemetry in the library or expose an instrumentation hook and integrate with the library using a wrapper, I would not want us to be working of a fork if the direction is not to integrate it via the library.

@porsager
Copy link
Owner

porsager commented Feb 20, 2024

@neerfri Thanks for the kind words — Yes, the onrequest feature will be added once I find time to do more work on Postgres.js, and I hope the design is specifically so that it's very easy to add something like opentelemetry simply by having an eg postgres-opentelemetry module that you pass to onrequest.

I see absolutely no need for including it as a dependency here if a standalone module can work fine using onrequest?

Would be cool to see how it's working for you using the query-stats branch.

@abustany
Copy link

abustany commented Mar 26, 2024

I took at stab for our internal use at this. TL;DR is the onquery API looks fine to me, and works as intended 👍

  • I cherry picked commit f20f276 on top of tag 3.4.3
  • I removed the code related to stats to keep the patch as small as possible (I only need tracing for now)
  • I applied that as a pnpm patch (but I suppose it can work the same with any other package manager) - I'm attaching the patch in case anyone is interested

After that, I added the following very crude code to our connection options:

import { SpanKind, SpanOptions, SpanStatusCode, Tracer, context } from "@opentelemetry/api";
import { hrTime, hrTimeDuration, hrTimeToMilliseconds } from "@opentelemetry/core";
import { SEMATTRS_DB_STATEMENT, SEMATTRS_NET_PEER_NAME } from "@opentelemetry/semantic-conventions";

type QueuedQuery = unknown;
interface Query extends PromiseLike<unknown> {
  string: string;
}
type QueryResults = unknown;
type QueryQueuedHandler = (q: QueuedQuery) => QueryStartedHandler | void;
type QueryStartedHandler = (q: Query) => QueryResultHandler | void;
type QueryResultHandler = (results: QueryResults) => void;

function tracingQueryHandler(tracer: Tracer, serverName: string): QueryQueuedHandler {
  return () => {
    const startTime = hrTime();

    return (q) => {
      const queuedTimeMs = hrTimeToMilliseconds(hrTimeDuration(startTime, hrTime()));
      const spanOptions: SpanOptions = {
        kind: SpanKind.CLIENT,
        attributes: {
          [SEMATTRS_NET_PEER_NAME]: serverName,
          [SEMATTRS_DB_STATEMENT]: q.string,
          "db.postgresjs.queued_time_ms": queuedTimeMs,
        },
      };

      const ctx = context.active();
      const span = tracer.startSpan("query", spanOptions, ctx);

      q.then(
        () => {
          span.setStatus({ code: SpanStatusCode.OK });
          span.end();
        },
        (e) => {
          span.recordException(e);
          span.setStatus({ message: e.message, code: SpanStatusCode.ERROR });
          span.end();
        }
      );
    };
  };
}

my DB connection options now look like this:

import { trace } from "@opentelemetry/api";

const tracer = trace.getTracer("postgres"); // change the name to whatever makes sense for you
const opts = {
  // other options...
  onquery: tracingQueryHandler(tracer, `${hostname}:${port}`),
}

Here's an example of trace produced by the above, including only the relevant fields:

{
  "parent_span_id": "3cb06a353a1e27db",
  "scope_name": "postgres",
  "service_name": "v2",
  "span_attributes": {
    "db.postgresjs.queued_time_ms": 0.054407,
    "db.statement": "select \"space_id\", \"id\", \"parent_id\", \"name\", \"creator_id\", \"create_time\", \"update_time\", \"delete_time\" from \"space_folders\" where (\"space_folders\".\"space_id\" = $1 and \"space_folders\".\"parent_id\" = $2 and \"space_folders\".\"delete_time\" is null) order by \"space_folders\".\"name\"",
    "net.peer.name": "127.0.0.1:25432"
  },
  "span_duration_millis": 11,
  "span_end_timestamp_nanos": 1711451792875317000,
  "span_fingerprint": "v2\u00003\u0000query",
  "span_id": "b2a78b4cea9fc7fc",
  "span_kind": 3,
  "span_name": "query",
  "span_start_timestamp_nanos": 1711451792864000000,
  "span_status": {
    "code": "ok"
  },
  "trace_id": "6fe69188c84b104b6229d78b679da79b"
}

Disclaimer: my understanding of OpenTelemetry is quite limited, so expect nonsense in all of the above.

Patch that can be applied using pnpm patch or similar:

From b1b1c03b46350b667286dc1f77a7ee104176881e Mon Sep 17 00:00:00 2001
From: Rasmus Porsager <[email protected]>
Date: Mon, 26 Jun 2023 00:39:02  0200
Subject: [PATCH] Add onquery handler

---
 src/connection.js |  1  
 src/index.js      |  2   
 src/query.js      | 30                          -----
 3 files changed, 28 insertions( ), 5 deletions(-)

diff --git a/src/connection.js b/src/connection.js
index 7d97a4b..9f1989f 100644
--- a/src/connection.js
    b/src/connection.js
@@ -165,6  165,7 @@ function Connection(options, queues = {}, { onopen = noop, onend = noop, onclose
         : (query = q, query.active = true)
 
       build(q)
       q.onquery && (q.onquery = q.onquery(q))
       return write(toBuffer(q))
         && !q.describeFirst
         && !q.cursorFn
diff --git a/src/index.js b/src/index.js
index 0573e2b..97de5a3 100644
--- a/src/index.js
    b/src/index.js
@@ -85,6  85,7 @@ function Postgres(a, b) {
 
   function Sql(handler) {
     handler.debug = options.debug
     handler.onquery = options.onquery
 
     Object.entries(options.types).reduce((acc, [name, type]) => {
       acc[name] = (x) => new Parameter(x, type.to)
@@ -491,6  492,7 @@ function parseOptions(a, b) {
     onclose         : o.onclose,
     onparameter     : o.onparameter,
     socket          : o.socket,
     onquery         : o.onquery,
     transform       : parseTransform(o.transform || { undefined: undefined }),
     parameters      : {},
     shared          : { retries: 0, typeArrayMap: {} },
diff --git a/src/query.js b/src/query.js
index 0d44a15..bf1c087 100644
--- a/src/query.js
    b/src/query.js
@@ -13,6  13,9 @@ export class Query extends Promise {
       reject = b
     })
 
     this.resolver = resolve
     this.rejecter = reject
 
     this.tagged = Array.isArray(strings.raw)
     this.strings = strings
     this.args = args
@@ -23,19  26,29 @@ export class Query extends Promise {
     this.state = null
     this.statement = null
 
-    this.resolve = x => (this.active = false, resolve(x))
-    this.reject = x => (this.active = false, reject(x))
-
     this.active = false
     this.cancelled = null
     this.executed = false
     this.signature = ''
     this.onquery = handler.onquery;
 
-    this[originError] = this.handler.debug
     this[originError] = handler.debug
       ? new Error()
       : this.tagged && cachedError(this.strings)
   }
 
   resolve(x) {
     this.active = false
     this.onquery && (this.onquery = this.onquery(x))
     this.resolver(x)
   }
 
   reject(x) {
     this.active = false
     this.onquery && (this.onquery = this.onquery(x))
     this.rejecter(x)
   }
 
   get origin() {
     return (this.handler.debug
       ? this[originError].stack
@@ -137,7  150,13 @@ export class Query extends Promise {
   }
 
   async handle() {
-    !this.executed && (this.executed = true) && await 1 && this.handler(this)
     if (this.executed)
       return
 
     this.executed = true
     await 1
     this.onquery && (this.onquery = this.onquery(this))
     this.handler(this)
   }
 
   execute() {
@@ -171,3  190,4 @@ function cachedError(xs) {
   Error.stackTraceLimit = x
   return originCache.get(xs)
 }
 
-- 
2.44.0

@abustany
Copy link

hmm it seems though that in a "real" setup not all queries are logged... Not sure yet where the issue is 🤔

@abustany
Copy link

we overwrite handler.onquery, but the handler is shared among many queries

@abustany
Copy link

I updated the patch above to copy the onquery handler on each Query instance, that query can then do whatever it wants with it.

@porsager porsager mentioned this issue May 27, 2024
@spacetoha
Copy link

Dear @porsager,

Are there any plans to merge the https://github.com/porsager/postgres/tree/query-stats soon?

Thanks,
Anton

@porsager
Copy link
Owner

sorry no plans right now

@devthejo
Copy link

devthejo commented Jul 7, 2024

thanks @abustany
the patch, is working great (used yarn patch from yarn 4), applied on postgres 3.4.4, I had to change q.string to q.strings.join("?")
here is my yarn patch: https://codeberg.org/devthefuture/modjo/src/branch/master/.yarn/patches/postgres-npm-3.4.4-90e4c53e1e.patch
and here is the implementation: https://codeberg.org/devthefuture/modjo/src/branch/master/plugins/postgres/tracing.js

@mbrevda
Copy link

mbrevda commented Jul 11, 2024

Anything we can do to help move this forward?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests