What Is Sampling in Java

Contents

Sampling.java
4. Sampling
4.1 Declarative sampling
4.2 Custom sampling
4.3 Sampling in Spring Cloud Sleuth
VisualVM CPU Sampling
Sampling vs Profiling
Profiling
Sampling
When to Use CPU Sampling
How to Run CPU Sampling
Local applications
Remote applications
How to Interpret the Data
Difference between “Self Time” and “Self Time (CPU)”
What Is Sampling in Java

Sampling.java

Below is Sampling.java from §2.1 Static Methods.

/******************************************************************************
 *  Compilation:  javac Sampling.java
 *  Execution:    java Sampling n p
 *
 *  Suppose that p = 40% of the population favors candidate A. If we take a
 *  random sample of n = 200 voters, what is the probability that less than
 *  half the voters support candidate A?
 *
 *  The number X of the n = 200 voters that support candidate A is binomial
 *  distributed with mean n * p = 200 * 0.4 = 80 and variance
 *  n * p * (1 - p) = 200 * 0.4 * 0.6 = 48. The standard deviation is
 *  sqrt(48), approximately 6.93.
 *
 *  The probability that less than half the voters support candidate A is
 *  the probability that X < n / 2 = 100.
 *
 *  Using the normal approximation to the binomial distribution, the
 *  probability that X < 100 is:
 *
 *      P(Z < (100 - 80) / 6.93) = P(Z < 2.89), approximately 0.998
 *
 ******************************************************************************/

public class Sampling {
    public static void main(String[] args) {
        int n = Integer.parseInt(args[0]);        // sample size
        double p = Double.parseDouble(args[1]);   // fraction favoring candidate A

        // Gaussian and StdOut are helper classes from the same course
        // library, not the JDK. Phi is the standard normal CDF.
        double prob = Gaussian.Phi((0.5*n - p*n) / Math.sqrt(n*p*(1-p)));
        StdOut.println("probability normal = " + prob);
    }
}
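As a cross-check on the normal approximation, the same probability can be summed exactly from the binomial distribution. The following is a minimal sketch that depends only on the JDK; the class name ExactSampling and the method probLessThan are illustrative names, not part of the original program, and the method assumes 0 < p < 1.

```java
public class ExactSampling {
    /**
     * Returns P(X < k) for X ~ Binomial(n, p), assuming 0 < p < 1.
     * Terms are built in log space so large n does not overflow.
     */
    public static double probLessThan(int n, double p, int k) {
        double logP = Math.log(p);
        double logQ = Math.log(1.0 - p);
        double logChoose = 0.0;   // log C(n, 0)
        double sum = 0.0;
        for (int i = 0; i < k && i <= n; i++) {
            sum += Math.exp(logChoose + i * logP + (n - i) * logQ);
            // log C(n, i+1) = log C(n, i) + log(n - i) - log(i + 1)
            logChoose += Math.log(n - i) - Math.log(i + 1);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Defaults mirror the example in the header comment above.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 200;
        double p = args.length > 1 ? Double.parseDouble(args[1]) : 0.4;
        // "Less than half" means X < n/2, i.e. X < ceil(n/2) for integer X.
        System.out.println("probability exact = " + probLessThan(n, p, (n + 1) / 2));
    }
}
```

Comparing this output with the normal approximation for the same n and p gives a feel for how close the approximation is.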


4. Sampling

Sampling may be employed to reduce the data collected and reported out of process. A span that is not sampled is a no-op and adds no overhead.

Sampling is an up-front decision, meaning that the decision to report data is made at the first operation in a trace and that decision is propagated downstream.

By default, a global sampler applies a single rate to all traced operations. Tracer.Builder.sampler controls this setting, and it defaults to tracing every request.
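The up-front nature of the decision can be sketched in plain Java. This is a simplified model, not Brave's implementation: the Sampler interface, the Context class, and the methods withRate, newTrace, and newChild are hypothetical names introduced only for illustration. The sampler is consulted once at the root of a trace, and downstream operations copy the decision rather than making their own.

```java
import java.util.Random;

public class UpFrontSamplingSketch {
    /** Hypothetical stand-in for a tracer's sampler; not Brave's actual interface. */
    public interface Sampler {
        boolean isSampled(long traceId);
    }

    /** Hypothetical trace context: the decision travels with the trace ID. */
    public static final class Context {
        public final long traceId;
        public final boolean sampled;

        Context(long traceId, boolean sampled) {
            this.traceId = traceId;
            this.sampled = sampled;
        }
    }

    /** Samples roughly the given fraction of traces, keyed on the trace ID. */
    public static Sampler withRate(double rate) {
        if (rate < 0.0 || rate > 1.0) {
            throw new IllegalArgumentException("rate must be between 0.0 and 1.0");
        }
        long boundary = Math.round(rate * 10_000);
        return traceId -> Math.floorMod(traceId, 10_000L) < boundary;
    }

    /** First operation in a trace: the only place the sampler is consulted. */
    public static Context newTrace(long traceId, Sampler sampler) {
        return new Context(traceId, sampler.isSampled(traceId));
    }

    /** Downstream operation: inherits the decision instead of re-deciding. */
    public static Context newChild(Context parent) {
        return new Context(parent.traceId, parent.sampled);
    }

    public static void main(String[] args) {
        Sampler tenPercent = withRate(0.1);
        Random ids = new Random();
        int kept = 0;
        for (int i = 0; i < 100_000; i++) {
            Context root = newTrace(ids.nextLong(), tenPercent);
            if (newChild(root).sampled) {
                kept++;
            }
        }
        System.out.println("sampled " + kept + " of 100000 traces");
    }
}
```

Because children never re-run the sampler, a trace is either reported in full or not at all, which is the property the paragraph above describes.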

4.1 Declarative sampling

Some applications need to sample based on the type or annotations of a Java method.

Most users use a framework interceptor to automate this sort of policy. The following example shows how that might work internally:

@Autowired Tracer tracer;

// Derives a sample rate from an annotation on a Java method
DeclarativeSampler<Traced> declarativeSampler = DeclarativeSampler.create(Traced::sampleRate);

@Around("@annotation(traced)")
public Object traceThing(ProceedingJoinPoint pjp, Traced traced) throws Throwable {
  // When there is no trace in progress, this decides using an annotation
  Sampler decideUsingAnnotation = declarativeSampler.toSampler(traced);
  Tracer tracerWithOverride = tracer.withSampler(decideUsingAnnotation);

  // This code looks the same as if there was no declarative override
  // (spanName is a helper that is not shown here)
  ScopedSpan span = tracerWithOverride.startScopedSpan(spanName(pjp));
  try {
    return pjp.proceed();
  } catch (RuntimeException | Error e) {
    span.error(e);
    throw e;
  } finally {
    span.finish();
  }
}

4.2 Custom sampling

Depending on what the operation is, you may want to apply different policies. For example, you might not want to trace requests to static resources such as images, or you might want to trace all requests to a new API.

Most users use a framework interceptor to automate this sort of policy. The following example shows how that might work internally:

@Autowired Tracer tracer;
@Autowired Sampler fallback;

Span nextSpan(final Request input) {
  Sampler requestBased = new Sampler() {
    @Override public boolean isSampled(long traceId) {
      if (input.url().startsWith("/experimental")) {
        return true;   // always trace the new API
      } else if (input.url().startsWith("/static")) {
        return false;  // never trace static resources
      }
      return fallback.isSampled(traceId);
    }
  };
  return tracer.withSampler(requestBased).nextSpan();
}

4.3 Sampling in Spring Cloud Sleuth

By default, Spring Cloud Sleuth sets all spans to non-exportable. That means that traces appear in logs but not in any remote store. For testing, the default is often enough, and it is probably all you need if you use only the logs (for example, with an ELK aggregator). If you export span data to Zipkin, there is also a Sampler.ALWAYS_SAMPLE setting that exports everything and a ProbabilityBasedSampler setting that samples a fixed fraction of spans.

The ProbabilityBasedSampler is the default if you use spring-cloud-sleuth-zipkin. You can configure the exports by setting spring.sleuth.sampler.probability. The passed value needs to be a double from 0.0 to 1.0.
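As a configuration fragment, the property named above could be set in the application's properties file like this. The value 0.1 is only an example, chosen to export roughly one span in ten.

```properties
# application.properties (0.1 is an illustrative value)
spring.sleuth.sampler.probability=0.1
```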

A sampler can be installed by creating a bean definition, as shown in the following example:

@Bean
public Sampler defaultSampler() {
  return Sampler.ALWAYS_SAMPLE;
}

You can set the HTTP header X-B3-Flags to 1, or, when doing messaging, you can set the spanFlags header to 1. Doing so forces the current span to be exportable regardless of the sampling decision.
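From a client, setting that header might look like the following sketch, which uses the JDK's java.net.http API (Java 11 or later). The URL http://localhost:8080/orders and the names DebugHeaderExample and debugRequest are hypothetical; only the header name and value come from the text above. The request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class DebugHeaderExample {
    /** Builds a GET request that asks the receiving service to export its span. */
    public static HttpRequest debugRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("X-B3-Flags", "1")   // force export regardless of sampling
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical endpoint, used only to show the resulting headers.
        HttpRequest request = debugRequest("http://localhost:8080/orders");
        System.out.println(request.headers().map());
    }
}
```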


VisualVM CPU Sampling

If you need to investigate CPU-related issues, sampling provides an easy mechanism for identifying bottlenecks with minimal effect on performance.

Sampling vs Profiling

First of all, let’s understand the difference between sampling and profiling, which is a key prerequisite.

Profiling

Profiling involves instrumenting the entire application code or only some classes in order to provide runtime performance metrics to the profiler application. Since this involves changes to the application code, which are applied automatically by the profiler, it also means that there is a certain performance impact and risk of affecting the existing functionality.

The actual degree of the performance impact is hard to determine, but it can become significant if CPU intensive sections are instrumented.

Profiling is usually recommended for optimizing specific algorithms or when you’re interested in measuring the invocation counts.

Sampling

Sampling, on the other hand, works by periodically retrieving thread dumps from the JVM. In this case, the performance impact is minor (and constant, since the thread dumps are retrieved at a fixed frequency) and there’s no risk of introducing side effects. This process is a lot less intrusive and can also be performed quite reliably on remote applications (i.e. it could even be applied to production instances).

The main downside of CPU sampling is accuracy: since the thread dump is retrieved at fixed intervals, there is a high risk of missing certain method invocations (especially the very fast ones). This means that the invocation count of methods is very inaccurate, but the total time spent (and CPU time) should still provide relevant metrics.
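The idea of sampling by periodic thread dumps can be sketched with nothing but the JDK. This is not how VisualVM is implemented; it is a minimal illustration, and the names StackSampler, sample, and busyLoop are made up for the example. The sampler records which method is on top of a thread's stack at fixed intervals, so a method is only ever seen if it happens to be running when a sample is taken.

```java
import java.util.HashMap;
import java.util.Map;

public class StackSampler {
    static volatile double sink;   // keeps the busy loop from being optimized away

    /** Records the top stack frame of the target thread at fixed intervals. */
    public static Map<String, Integer> sample(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                String method = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(method, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // stop sampling early
                break;
            }
        }
        return hits;
    }

    /** CPU-bound work for the sampler to observe. */
    public static void busyLoop() {
        double x = 0;
        while (!Thread.currentThread().isInterrupted()) {
            x += Math.sqrt(x + 1);
        }
        sink = x;
    }

    public static void main(String[] args) {
        Thread worker = new Thread(StackSampler::busyLoop, "worker");
        worker.setDaemon(true);
        worker.start();
        Map<String, Integer> hits = sample(worker, 100, 10);
        worker.interrupt();
        hits.forEach((method, count) -> System.out.println(count + "  " + method));
    }
}
```

The counts are numbers of samples, not numbers of calls, which is why a sampler can estimate where time goes but cannot report reliable invocation counts.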

When to Use CPU Sampling

Unless you need very precise performance metrics, you should use sampling most of the time. The main advantage of profiling is its accuracy, but because instrumentation adds its own performance cost, most of the reported metrics will be off by an unknown factor.

How to Run CPU Sampling

Local applications

For local applications, launch VisualVM from the JDK binary directory, then:

1. Double-click the process that you would like to monitor in the left-hand panel.
2. Click the “Sampler” tab.
3. When you are ready to perform your test, click the “CPU” button next to the “Sample” label.
4. Once the test has finished, press “Stop” and then press the “Snapshot” button.

Please keep in mind that the data displayed before taking a snapshot may or may not be very accurate. You should do your analysis only on snapshots.

A common mistake when following these steps is to take an actual screenshot of the sampling screen. While the thought is appreciated, the data is mostly pointless: most of the time the performance bottlenecks are somewhere deeper in the call hierarchy and cannot be seen from the overview.

Remote applications

For remote applications, the process is very similar, but it requires setting up a JMX connection to the Java process to be monitored.

1. Enable the JMX port on your application. This is outside the scope of this article, but you can check the official Oracle documentation for more details: Monitoring and Management Using JMX Technology.
2. Right-click the “Remote” entry in the left-hand panel.
3. Select “Add Remote Host”.
4. Fill in the host name. Most of the time this will be sufficient, but depending on how you’ve enabled JMX on the remote process, you may also need to check the “Advanced Settings” tab.
5. Right-click the newly added host and select “Add JMX connection”.
6. Fill in the connection details (including the port number) and the display name. If your application is deployed as multiple processes, you should enter a descriptive display name.
7. Double-click the newly added JMX connection.
8. From this point onwards, the steps are identical to the ones in “Local applications”.

How to Interpret the Data

This depends a lot on the actual issues that you’re trying to investigate and on the application architecture. For example, if you have a desktop application with a fixed number of threads, the call tree with the breakdown per thread may be useful.

However, if you’re working on a web application with a variable number of threads, it will probably be hard to figure out what’s happening. In this case, you should probably start from the “Hot spots” tab and dig deeper from there.

Difference between “Self Time” and “Self Time (CPU)”

VisualVM reports two metrics related to the duration, but there is a significant difference between them:

self time — counts the total time spent in that method, including the amount of time spent on locks or other blocking behaviour
self time (cpu) — counts the total time spent in that method, excluding the amount of time the thread was blocked

From here, you will need to decide what you want to focus on:

if you want to optimise the multithreaded interactions, then you should look at the self time values, which include the time the threads were blocked
if you’re interested in the overall performance and don’t care too much about the multithreaded interactions, then you should focus solely on the self time (cpu).

Be careful, though, about how you interpret your results. If you have a thread that keeps a connection open, you will most likely see some very large numbers for the self time. This is normal and is not an issue.
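The gap between the two metrics can be reproduced with the JDK's own timers. The sketch below is not VisualVM's measurement code; SelfTimeDemo, measure, and blockedTask are illustrative names. It times a task that spends its time sleeping, using wall-clock time (which includes blocked time) and per-thread CPU time (which does not).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class SelfTimeDemo {
    /** Returns {wall-clock nanoseconds, CPU nanoseconds} for one run of the task. */
    public static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();

        long wallStart = System.nanoTime();
        long cpuStart = cpuSupported ? threads.getCurrentThreadCpuTime() : 0L;
        task.run();
        long cpu = cpuSupported ? threads.getCurrentThreadCpuTime() - cpuStart : 0L;
        long wall = System.nanoTime() - wallStart;
        return new long[] {wall, cpu};
    }

    /** A task that is blocked for almost all of its duration. */
    public static void blockedTask() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long[] t = measure(SelfTimeDemo::blockedTask);
        System.out.println("wall time (ms): " + t[0] / 1_000_000);
        System.out.println("cpu time  (ms): " + t[1] / 1_000_000);
    }
}
```

A thread waiting on an open connection behaves like blockedTask here: a large wall-clock figure with very little CPU time behind it.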


What Is Sampling in Java

Sampling of one variable. Samplings are often used to represent independent variables for sampled functions. They describe the values at which a function is sampled. For efficiency, and to guarantee a unique mapping from sample value to function value, we restrict samplings to be strictly increasing. In other words, no two samples have equal value, and sample values increase with increasing sample index.

Samplings are either uniform or non-uniform. Uniform samplings are represented by a sample count n, a sampling interval d, and a first sample value f. Non-uniform samplings are represented by an array of sample values.

All sample values are computed and stored in double precision. This double precision can be especially important in uniform samplings, where the sampling interval d and first sample value f may be used to compute values for thousands of samples, in loops like this one:

  int n = sampling.getCount();
  double d = sampling.getDelta();
  double f = sampling.getFirst();
  double v = f;
  for (int i=0; i<n; ++i,v+=d) {
    // some calculation that uses the sample value v
  }

In each iteration of the loop above, the sample value v is computed by accumulating the sampling interval d. This computation is fast, but it also yields rounding error that can grow quadratically with the number of samples n. If v were computed in single (float) precision, then this rounding error could exceed the sampling interval d for as few as n=10,000 samples. If accumulating in double precision is insufficient, a more accurate and more costly way is to compute each sample value directly from its index, as v = f + i*d, instead of accumulating.

With this computation of sample values, rounding errors can grow only linearly with the number of samples n.

Two samplings are considered equivalent if their sample values differ by no more than the sampling tolerance. This tolerance may be specified, as a fraction of the sampling interval, when a sampling is constructed. Alternatively, a default tolerance may be used. When comparing two samplings, the smaller of their tolerances is used.

A sampling is immutable. New samplings can be constructed by applying various transformations (e.g., shifting) to an existing sampling, but an existing sampling cannot be changed. Therefore, multiple sampled functions can safely share the same sampling.
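The effect of accumulation on rounding error is easy to observe in single precision. The sketch below is an illustration only and does not use the Sampling class; the class SampleValueError, its method names, and the values n = 1,000,000, d = 0.1, f = 0 are assumptions chosen to make the error visible. It compares float sample values obtained by accumulating d against values computed directly as f + i*d, using double-precision arithmetic as the reference.

```java
public class SampleValueError {
    /** Largest error when sample values are accumulated in float: v += d. */
    public static double maxErrorAccumulated(int n, float d, float f) {
        double worst = 0.0;
        float v = f;
        for (int i = 0; i < n; ++i, v += d) {
            double reference = (double) f + i * (double) d;
            worst = Math.max(worst, Math.abs(v - reference));
        }
        return worst;
    }

    /** Largest error when each value is computed from its index: v = f + i*d. */
    public static double maxErrorDirect(int n, float d, float f) {
        double worst = 0.0;
        for (int i = 0; i < n; ++i) {
            float v = f + i * d;
            double reference = (double) f + i * (double) d;
            worst = Math.max(worst, Math.abs(v - reference));
        }
        return worst;
    }

    public static void main(String[] args) {
        int n = 1_000_000;   // illustrative sample count
        float d = 0.1f;      // illustrative sampling interval
        float f = 0.0f;      // illustrative first sample value
        System.out.println("accumulated: " + maxErrorAccumulated(n, d, f));
        System.out.println("direct:      " + maxErrorDirect(n, d, f));
    }
}
```

With these inputs the accumulated error ends up larger than the sampling interval itself, while the directly computed values stay within a small fraction of it.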
