Operation-level vision-based monitoring and documentation has drawn significant attention from construction practitioners and researchers. Our proposed method, named TD-CEDN, solves two important issues in this low-level vision problem: (1) learning. Given its axiomatic importance, however, we find that object contour detection is relatively under-explored in the literature. A complete decoder network setup is listed in Table. Because the fully connected layers after pool5 are discarded, higher-resolution feature maps are retained while the number of encoder parameters drops significantly (from 134M to 14.7M). We trained our network using the publicly available Caffe[55] library and built it on top of the implementations of FCN[23], HED[19], SegNet[25] and CEDN[13]. Note that we use the originally annotated contours instead of our refined ones as ground truth for unbiased evaluation. The encoder-decoder network is composed of two parts: the encoder (convolution) network and the decoder (deconvolution) network. The four methods mentioned above[20, 48, 21, 22] are all patch-based; none is an end-to-end trained network that predicts the holistic image. The proposed multi-tasking convolutional neural network did not employ any pre- or postprocessing step. Our network is trained end-to-end on PASCAL VOC with ground truth refined from inaccurate polygon annotations, yielding much higher precision in object contour detection than previous methods. According to the results, performance differs substantially between these two training strategies. We develop a deep learning algorithm for contour detection with a fully convolutional encoder-decoder network. A novel semantic segmentation algorithm learns a deep deconvolution network on top of the convolutional layers adopted from the VGG 16-layer net and demonstrates outstanding performance on the PASCAL VOC 2012 dataset.
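The parameter reduction quoted above (from 134M to 14.7M) can be checked with a short calculation. The sketch below assumes the standard VGG-16 layout (thirteen 3×3 convolutions, a 7×7×512 input to fc6, 4096-unit fc6 and fc7) and assumes that the 134M figure counts the convolutional layers plus fc6 and fc7; the text does not spell out either point, so treat the numbers as a plausibility check rather than the authors' own accounting.

```python
# (in_channels, out_channels) of each 3x3 convolution in the standard VGG-16 layout.
CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]


def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k convolution."""
    return k * k * c_in * c_out + c_out


def fc_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


# Encoder truncated after pool5: convolutional layers only.
conv_total = sum(conv_params(c_in, c_out) for c_in, c_out in CONV_LAYERS)

# Assumption: the larger figure also includes fc6 and fc7.
fc6 = fc_params(7 * 7 * 512, 4096)
fc7 = fc_params(4096, 4096)
with_fc = conv_total + fc6 + fc7

print(f"conv only:       {conv_total / 1e6:.1f}M")
print(f"conv + fc6 + fc7: {with_fc / 1e6:.1f}M")
```

Under these assumptions, almost all of the removed parameters sit in fc6, which is why truncating the encoder at pool5 shrinks it so sharply.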
Note that our model is not deliberately designed for natural edge detection on BSDS500, and we believe that the techniques used in HED[47], such as multiscale fusion, carefully designed upsampling layers and data augmentation, could further improve its performance. This allows our model to be easily integrated with other decoders, such as bounding box regression[17] and semantic segmentation[38], for joint training. Combined with the multiscale combinatorial grouping algorithm, our method can generate high-quality segmented object proposals, which significantly advance the state of the art on PASCAL VOC (improving average recall from 0.62 to 0.67) with a relatively small number of candidates (~1660 per image). Previous efforts lift edge detection to a higher level of abstraction, but still fall below human perception because they lack object-level knowledge. Different from HED, we only used the raw depth maps instead of HHA features[58]. We first examine how well our CEDN model trained on PASCAL VOC can generalize to unseen object categories in this dataset. Encoder-decoder architectures can handle inputs and outputs that both consist of variable-length sequences and thus are suitable for seq2seq problems such as machine translation. (up to the fc6 layer) and, to achieve dense prediction at image size, our decoder is constructed by alternating unpooling and convolution layers, where the unpooling layers re-use the switches from the max-pooling layers of the encoder to upscale the feature maps. Different from previous low-level edge detection, our algorithm focuses on detecting higher-level object contours. To address the imperfect contour annotations derived from polygons, we have developed a refinement method based on dense CRF so that the proposed network can be trained in an end-to-end manner. Given that over 90% of the ground truth is non-contour.
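The unpooling step described above, in which the decoder re-uses the max-pooling switches of the encoder, can be illustrated on a single feature map. This is a minimal pure-Python sketch, not the Caffe layers used by the authors; the function names `max_pool_2x2` and `unpool_2x2` are hypothetical, and the input height and width are assumed to be even.

```python
def max_pool_2x2(fmap):
    """2x2 max-pooling with stride 2.

    Returns the pooled map and the "switches": for every pooled value,
    the (row, col) position it came from in the input.
    """
    height, width = len(fmap), len(fmap[0])
    pooled, switches = [], []
    for r in range(0, height, 2):
        pooled_row, switch_row = [], []
        for c in range(0, width, 2):
            window = [(fmap[r + dr][c + dc], (r + dr, c + dc))
                      for dr in (0, 1) for dc in (0, 1)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            switch_row.append(position)
        pooled.append(pooled_row)
        switches.append(switch_row)
    return pooled, switches


def unpool_2x2(pooled, switches, out_height, out_width):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [[0] * out_width for _ in range(out_height)]
    for pooled_row, switch_row in zip(pooled, switches):
        for value, (r, c) in zip(pooled_row, switch_row):
            out[r][c] = value
    return out


feature_map = [
    [1, 2, 0, 1],
    [3, 4, 1, 0],
    [0, 0, 5, 6],
    [1, 0, 7, 8],
]
pooled, switches = max_pool_2x2(feature_map)
restored = unpool_2x2(pooled, switches, 4, 4)
```

The restored map is sparse: only the positions that won the max survive. This is why the decoder alternates unpooling with convolution layers, which fill in the zeros.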
Compared with CEDN, our fine-tuned model achieves better recall but worse precision on the PR curve. Two recent works, HED[19] and CEDN[13], which have achieved the best performances on the BSDS500 dataset, are the baselines against which our method is compared. We present a deep learning algorithm for contour detection with a fully convolutional encoder-decoder network that generalizes well to unseen object classes from the same supercategories on MS COCO and can match state-of-the-art edge detection on BSDS500 with fine-tuning. Figure 8 shows that CEDNMCG achieves 0.67 AR and 0.83 ABO with 1660 proposals per image, which improves on the second-best method, MCG, by 8% in AR and by 3% in ABO with a third as many proposals. The ground truth contour mask is processed in the same way. To address the problem of irregular text regions in natural scenes, we propose an arbitrary-shaped text detection model based on Deformable DETR called BSNet. Each image has 4-8 hand-annotated ground truth contours. The Canny detector[31], which is perhaps the most widely used method up to now, models edges as sharp discontinuities in the local gradient space, adding non-maximum suppression and hysteresis thresholding steps. We choose this dataset for training our object contour detector with the proposed fully convolutional encoder-decoder network. We use the DSN[30] to supervise each upsampling stage, as shown in Fig.
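To make the hysteresis thresholding step of the Canny detector concrete, the sketch below applies it to a small grid of gradient magnitudes. It covers only that final step: smoothing, gradient computation and non-maximum suppression are assumed to have been done already. The function name, the 8-connectivity and the threshold values are illustrative choices, not taken from the text.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) connected to them.

    Connectivity is 8-neighbourhood; the result is a boolean edge map.
    """
    height, width = len(magnitude), len(magnitude[0])
    edge = [[False] * width for _ in range(height)]
    queue = deque()

    # Seed the search with every strong pixel.
    for r in range(height):
        for c in range(width):
            if magnitude[r][c] >= high:
                edge[r][c] = True
                queue.append((r, c))

    # Grow edges through neighbouring pixels that pass the low threshold.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < height and 0 <= nc < width
                        and not edge[nr][nc] and magnitude[nr][nc] >= low):
                    edge[nr][nc] = True
                    queue.append((nr, nc))
    return edge


gradient = [
    [0.9, 0.4, 0.0, 0.4],
    [0.0, 0.0, 0.0, 0.0],
]
edges = hysteresis(gradient, low=0.3, high=0.8)
```

In this example the weak response next to the strong pixel is kept, while the isolated weak response in the last column is discarded.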
With the same training strategy, our method achieved the best ODS of 0.781, which is higher than the ODS of 0.766 for HED, as shown in Fig. Recently, deep convolutional networks[29] have demonstrated a remarkable ability to learn high-level representations for object recognition[18, 10]. Different from previous low-level edge detection, our algorithm focuses on detecting higher-level object contours. We demonstrate state-of-the-art evaluation results on three common contour detection datasets. The training set is denoted by $S=\{(I_i, G_i)\}_{i=1}^{N}$, where the image sample $I_i$ refers to the $i$-th raw input image and $G_i$ refers to the corresponding ground truth edge map of $I_i$. To prepare the labels for contour detection from the PASCAL dataset, run create_lables.py after editing the file to add the path of the labels and of the new labels to be generated. Use this path for labels during training. In the encoder part, all of the pooling layers are max-pooling with a 2×2 window and a stride of 2 (non-overlapping window). The experiments have shown that the proposed method improves contour detection performance and outperforms some existing convolutional neural network based methods on the BSDS500 and NYUD-V2 datasets. Given trained models, all the test images are fed forward through our CEDN network in their original sizes to produce contour detection maps. We use the layers up to pool5 from the VGG-16 net[27] as the encoder network. Comparing HED-RGB with TD-CEDN-RGB (ours) gives the same indication: our method predicts the contours more precisely and clearly, though HED's published F-scores (0.720 for RGB and 0.746 for RGBD) are higher than ours. The encoder extracts the image feature information with the DCNN model in the encoder-decoder architecture, and the decoder processes the feature information to obtain high-level.
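The ODS figures quoted above are F-scores at the optimal dataset scale, that is, at the single threshold that gives the best F-score over the whole dataset. The sketch below shows that selection under a simplifying assumption: the true-positive, false-positive and false-negative counts per image and threshold are taken as given, whereas the real benchmark obtains them by matching predicted and annotated contour pixels within a distance tolerance. The function names and the toy counts are hypothetical.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def ods(counts_per_image):
    """Pick the single threshold with the best dataset-level F-score.

    counts_per_image[i][t] is a (tp, fp, fn) tuple for image i at
    threshold index t. Counts are summed over images before computing
    precision and recall, so one threshold is shared by all images.
    """
    n_thresholds = len(counts_per_image[0])
    best_index, best_f = None, -1.0
    for t in range(n_thresholds):
        tp = sum(image[t][0] for image in counts_per_image)
        fp = sum(image[t][1] for image in counts_per_image)
        fn = sum(image[t][2] for image in counts_per_image)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        score = f_score(precision, recall)
        if score > best_f:
            best_index, best_f = t, score
    return best_index, best_f


# Two images, two candidate thresholds (toy numbers, not from the paper).
toy_counts = [
    [(8, 2, 2), (5, 0, 5)],
    [(6, 4, 4), (5, 1, 5)],
]
best_threshold, best_f = ods(toy_counts)
```

Choosing the best threshold separately for each image instead would give the OIS variant of the score.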
We find that the learned model generalizes well to unseen object classes from the same supercategories on MS COCO and can match state-of-the-art edge detection on BSDS500 with fine-tuning. During training, we fix the encoder parameters (VGG-16) and only optimize the decoder parameters. Due to the asymmetric nature of image labeling problems (image input and mask output), we break the symmetric structure of deconvolutional networks and introduce a lightweight decoder. Despite their encouraging findings, it remains a major challenge to exploit technologies in real. We proposed a weakly trained multi-decoder segmentation-based architecture for real-time object detection and localization in ultrasound scans. We use the layers up to fc6 from VGG-16 net[45] as our encoder. For each training image, we randomly crop four 224×224×3 patches, which together with their mirrored versions compose a 224×224×3×8 minibatch.
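The minibatch construction described above (four random 224×224×3 crops plus their mirrored versions) can be sketched as follows. Images are represented as nested lists of RGB tuples to keep the example dependency-free; the function names, the fixed seed and the synthetic 250×300 image are illustrative assumptions. In practice the ground truth mask would be cropped with the same offsets and mirrored in the same way, which is omitted here.

```python
import random


def random_crop(image, size, rng):
    """Cut a size x size patch at a random position (image is rows of pixels)."""
    height, width = len(image), len(image[0])
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]


def mirror(patch):
    """Flip a patch horizontally."""
    return [row[::-1] for row in patch]


def make_minibatch(image, size=224, n_crops=4, seed=0):
    """Return n_crops random patches followed by their mirrored versions."""
    rng = random.Random(seed)
    crops = [random_crop(image, size, rng) for _ in range(n_crops)]
    return crops + [mirror(patch) for patch in crops]


# Synthetic 250 x 300 RGB image whose pixels encode their own coordinates.
image = [[(r, c, 0) for c in range(300)] for r in range(250)]
batch = make_minibatch(image)
```

The batch holds eight patches: entries 0 to 3 are the crops and entries 4 to 7 are their mirrors in the same order.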
Since we convert fc6 to be convolutional, we name it conv6 in our decoder. Our fine-tuned model achieved the best ODS F-score of 0.588. We fine-tuned the model TD-CEDN-over3 (ours) with the NYUD training dataset.
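Converting a fully connected layer such as fc6 into a convolutional one relies on a simple equivalence: a fully connected unit over a flattened k×k patch computes the same value as a k×k convolution kernel applied at that patch. The toy sketch below shows this for a single channel and a single output unit; the sizes and function names are illustrative and are not the actual fc6 dimensions. "Convolution" here means the cross-correlation used by deep learning frameworks.

```python
def fc_forward(weights, patch):
    """One fully connected unit applied to a flattened 2-D patch."""
    flat = [value for row in patch for value in row]
    return sum(w * v for w, v in zip(weights, flat))


def fc_to_conv_kernel(weights, k):
    """Reshape the flat weight vector into a k x k kernel (row-major)."""
    return [weights[i * k:(i + 1) * k] for i in range(k)]


def conv_valid(kernel, image):
    """Slide the kernel over the image with stride 1 and no padding."""
    k = len(kernel)
    height, width = len(image), len(image[0])
    return [
        [sum(kernel[i][j] * image[r + i][c + j]
             for i in range(k) for j in range(k))
         for c in range(width - k + 1)]
        for r in range(height - k + 1)
    ]


fc_weights = [1, 2, 3, 4]
kernel = fc_to_conv_kernel(fc_weights, 2)

image = [
    [1, 0, 2],
    [0, 1, 0],
    [3, 0, 1],
]
top_left_patch = [row[:2] for row in image[:2]]

fc_output = fc_forward(fc_weights, top_left_patch)
conv_output = conv_valid(kernel, image)
```

The convolutional form returns the fully connected response at every spatial position, which is what lets the network accept inputs larger than the patch size the layer was originally trained on.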