I really need some help here:
We are using Spark 3.1.2 on a standalone cluster. Since we started using the s3a directory committer, the stability and performance of our Spark jobs have improved significantly!
Lately, however, we have spent days troubleshooting an issue with this s3a directory committer and are completely baffled. Do you have any idea what is going on?
Our Spark jobs fail with a Java OOM error (or rather, a process/thread limit error):
An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
at java.base/java.lang.Thread.start0(Native Method)
at java.base/java.lang.Thread.start(Thread.java:803)
at java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:937)
at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1343)
at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
at java.base/java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:714)
at org.apache.spark.rpc.netty.DedicatedMessageLoop.$anonfun$new$1(MessageLoop.scala:174)
at org.apache.spark.rpc.netty.DedicatedMessageLoop.$anonfun$new$1$adapted(MessageLoop.scala:173)
at scala.collection.immutable.Range.foreach(Range.scala:158)
at org.apache.spark.rpc.netty.DedicatedMessageLoop.<init>(MessageLoop.scala:173)
at org.apache.spark.rpc.netty.Dispatcher.liftedTree1$1(Dispatcher.scala:75)
at org.apache.spark.rpc.netty.Dispatcher.registerRpcEndpoint(Dispatcher.scala:72)
at org.apache.spark.rpc.netty.NettyRpcEnv.setupEndpoint(NettyRpcEnv.scala:136)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:231)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:394)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:189)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:277)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:458)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:238)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Thread.java:834)
The Spark thread dump shows more than 5000 committer threads on the Spark driver! Here is an example:
Thread ID Thread Name Thread State Thread Locks
1047 s3-committer-pool-0 WAITING
1449 s3-committer-pool-0 WAITING
1468 s3-committer-pool-0 WAITING
1485 s3-committer-pool-0 WAITING
1505 s3-committer-pool-0 WAITING
1524 s3-committer-pool-0 WAITING
1529 s3-committer-pool-0 WAITING
1544 s3-committer-pool-0 WAITING
1549 s3-committer-pool-0 WAITING
1809 s3-committer-pool-0 WAITING
1972 s3-committer-pool-0 WAITING
1998 s3-committer-pool-0 WAITING
2022 s3-committer-pool-0 WAITING
2043 s3-committer-pool-0 WAITING
2416 s3-committer-pool-0 WAITING
2453 s3-committer-pool-0 WAITING
2470 s3-committer-pool-0 WAITING
2517 s3-committer-pool-0 WAITING
2534 s3-committer-pool-0 WAITING
2551 s3-committer-pool-0 WAITING
2580 s3-committer-pool-0 WAITING
2597 s3-committer-pool-0 WAITING
2614 s3-committer-pool-0 WAITING
2631 s3-committer-pool-0 WAITING
2726 s3-committer-pool-0 WAITING
2743 s3-committer-pool-0 WAITING
2763 s3-committer-pool-0 WAITING
2780 s3-committer-pool-0 WAITING
2819 s3-committer-pool-0 WAITING
2841 s3-committer-pool-0 WAITING
2858 s3-committer-pool-0 WAITING
2875 s3-committer-pool-0 WAITING
2925 s3-committer-pool-0 WAITING
2942 s3-committer-pool-0 WAITING
2963 s3-committer-pool-0 WAITING
2980 s3-committer-pool-0 WAITING
3020 s3-committer-pool-0 WAITING
3037 s3-committer-pool-0 WAITING
3055 s3-committer-pool-0 WAITING
3072 s3-committer-pool-0 WAITING
3127 s3-committer-pool-0 WAITING
3144 s3-committer-pool-0 WAITING
3163 s3-committer-pool-0 WAITING
3180 s3-committer-pool-0 WAITING
3222 s3-committer-pool-0 WAITING
3242 s3-committer-pool-0 WAITING
3259 s3-committer-pool-0 WAITING
3278 s3-committer-pool-0 WAITING
3418 s3-committer-pool-0 WAITING
3435 s3-committer-pool-0 WAITING
3452 s3-committer-pool-0 WAITING
3469 s3-committer-pool-0 WAITING
3486 s3-committer-pool-0 WAITING
3491 s3-committer-pool-0 WAITING
3501 s3-committer-pool-0 WAITING
3508 s3-committer-pool-0 WAITING
4029 s3-committer-pool-0 WAITING
4093 s3-committer-pool-0 WAITING
4658 s3-committer-pool-0 WAITING
4666 s3-committer-pool-0 WAITING
4907 s3-committer-pool-0 WAITING
5102 s3-committer-pool-0 WAITING
5119 s3-committer-pool-0 WAITING
5158 s3-committer-pool-0 WAITING
5175 s3-committer-pool-0 WAITING
5192 s3-committer-pool-0 WAITING
5209 s3-committer-pool-0 WAITING
5226 s3-committer-pool-0 WAITING
5395 s3-committer-pool-0 WAITING
5634 s3-committer-pool-0 WAITING
5651 s3-committer-pool-0 WAITING
5668 s3-committer-pool-0 WAITING
5685 s3-committer-pool-0 WAITING
5702 s3-committer-pool-0 WAITING
5722 s3-committer-pool-0 WAITING
5739 s3-committer-pool-0 WAITING
6144 s3-committer-pool-0 WAITING
6167 s3-committer-pool-0 WAITING
6289 s3-committer-pool-0 WAITING
6588 s3-committer-pool-0 WAITING
6628 s3-committer-pool-0 WAITING
6645 s3-committer-pool-0 WAITING
6662 s3-committer-pool-0 WAITING
6675 s3-committer-pool-0 WAITING
6692 s3-committer-pool-0 WAITING
6709 s3-committer-pool-0 WAITING
7049 s3-committer-pool-0 WAITING
This is despite our configuration not allowing more than 100 threads... or we are misunderstanding something.
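For anyone who wants to reproduce the count, here is a minimal Python sketch for tallying thread names in a dump like the one above. It assumes each row has the form `<id> <name> <state>` and that thread names contain no spaces; the function name `count_threads_by_name` and the three-row sample are purely illustrative.

```python
from collections import Counter

def count_threads_by_name(dump_text):
    """Tally thread names in a dump whose rows look like '<id> <name> <state>'."""
    counts = Counter()
    for line in dump_text.splitlines():
        parts = line.split()
        # Skip the header and any row that does not start with a numeric thread ID.
        if len(parts) < 3 or not parts[0].isdigit():
            continue
        counts[parts[1]] += 1
    return counts

# Illustrative sample: the header plus the first three rows of the dump above.
sample = """Thread ID Thread Name Thread State Thread Locks
1047 s3-committer-pool-0 WAITING
1449 s3-committer-pool-0 WAITING
1468 s3-committer-pool-0 WAITING"""

counts = count_threads_by_name(sample)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Running it over the full dump gives one total per thread name, which makes it easy to compare the number of `s3-committer-pool-0` threads against the configured limits.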
Here are our configuration settings:
fs.s3a.threads.max 100
fs.s3a.connection.maximum 1000
fs.s3a.committer.threads 16
fs.s3a.max.total.tasks 5
fs.s3a.committer.name directory
fs.s3a.fast.upload.buffer disk
io.file.buffer.size 1048576
mapreduce.outputcommitter.factory.scheme.s3a - org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory
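To make the settings above easier to reproduce, here is a minimal Python sketch that turns them into Spark configuration keys. It assumes the values are passed through Spark's `spark.hadoop.` prefix, which Spark copies into the Hadoop configuration; the dictionary name `S3A_SETTINGS` and the helper `to_spark_conf` are illustrative and not necessarily how our job is actually wired.

```python
# Hadoop/S3A settings copied from the list above, kept as strings.
S3A_SETTINGS = {
    "fs.s3a.threads.max": "100",
    "fs.s3a.connection.maximum": "1000",
    "fs.s3a.committer.threads": "16",
    "fs.s3a.max.total.tasks": "5",
    "fs.s3a.committer.name": "directory",
    "fs.s3a.fast.upload.buffer": "disk",
    "io.file.buffer.size": "1048576",
    "mapreduce.outputcommitter.factory.scheme.s3a":
        "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
}

def to_spark_conf(hadoop_settings):
    """Prefix each Hadoop key with 'spark.hadoop.' (assumed delivery mechanism)."""
    return {f"spark.hadoop.{key}": value for key, value in hadoop_settings.items()}

spark_conf = to_spark_conf(S3A_SETTINGS)

# Print the settings as spark-submit style arguments.
for key, value in sorted(spark_conf.items()):
    print(f"--conf {key}={value}")
```

The resulting dictionary could also be applied one key at a time through `SparkSession.builder.config(key, value)` in a PySpark job.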
We have already tried different versions of the spark-hadoop-cloud library, but the problem is consistently the same.
We would really appreciate it if you could point us in the right direction.