02. Structure of the Mat Class

빠리빵 2019. 4. 1. 23:06

I wrote a short example program that gives a quick overview of OpenCV's Mat class.

#include <cstdlib>
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>

using namespace std;
using namespace cv;

// test function that creates an image
Mat getGrayImage() {
	// create image 
	Mat image(Size(500, 500), CV_8U, 50);

	// for reference: OpenCV's documentation of the (rows, cols, type, s) constructor overload
	/** @overload
	@param rows Number of rows in a 2D array.
	@param cols Number of columns in a 2D array.
	@param type Array type. Use CV_8UC1, ..., CV_64FC4 to create 1-4 channel matrices, or
	CV_8UC(n), ..., CV_64FC(n) to create multi-channel (up to CV_CN_MAX channels) matrices.
	@param s An optional value to initialize each matrix element with. To set all the matrix elements to
	the particular value after the construction, use the assignment operator
	Mat::operator=(const Scalar& value) .
	*/
	// Mat(int rows, int cols, int type, const Scalar& s);
	
	// return it
	return image;
}

int main() {
	// create a new image made of 50 rows and 320 columns
	Mat image(50, 320, CV_8U, 50);
	namedWindow("image");
	imshow("image", image); // show the image
	waitKey(0);	// wait for a key pressed

	// re-allocate a new image
	image.create(Size(200, 200), CV_8U);
	image = 255;
	namedWindow("image_allocated");
	imshow("image_allocated", image);	// show the image
	waitKey(0);	// wait for a key pressed

	//create a red color image
	// channel order is BGR
	Mat image_red(Size(240, 320), CV_8UC3, Scalar(0, 0, 255));
	// or
	// Mat image_red(Size(240, 320), CV_8UC3);
	// image_red = Scalar(0, 0, 255);
	namedWindow("image_red");
	imshow("image_red", image_red);
	waitKey(0);

	// read an image
	Mat lena = imread("lena.tif");
	namedWindow("lena");
	imshow("lena", lena);
	waitKey(0);

	// all these images point to the same data block
	Mat copiedLena(lena);
	image = lena;

	// these images are new copies of the source image
	Mat newCopiedLena;
	lena.copyTo(newCopiedLena);
	Mat newCopiedLena2;
	newCopiedLena2 = lena.clone();

	// transform the image for testing
	flip(lena, lena, -1);
	
	// check which images have been affected by the processing
	imshow("lena", lena);
	imshow("image", image);
	imshow("copiedLena", copiedLena);
	imshow("newCopiedLena", newCopiedLena);
	imshow("newCopiedLena2", newCopiedLena2);
	waitKey(0);

	// get a gray-level image from a function
	Mat gray = getGrayImage();

	imshow("grayImage", gray);	
	waitKey(0);

	// read the image in gray scale
	Mat gray_image = imread("lena.tif", IMREAD_GRAYSCALE);	// CV_LOAD_IMAGE_GRAYSCALE is the legacy name, removed in OpenCV 4
	Mat gray_image_float;
	gray_image.convertTo(gray_image_float, CV_32F, 1 / 255.0, 0.0);	// CV_32F -> data range : 0.0 ~ 1.0 

	imshow("gray_image_float", gray_image_float);	
	waitKey(0);

	system("pause");	// Windows-only: keeps the console window open
}

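The example is worth examining from three angles:
1. How to create a Mat object, including its type and initial value

2. How to copy a Mat, and what is actually copied (the pixel values or just the address)

3. Converting to a different type

1. Creating a Mat object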

	// create a new image made of 50 rows and 320 columns
	Mat image(50, 320, CV_8U, 50);
	namedWindow("image");
	imshow("image", image); // show the image
	waitKey(0);	// wait for a key pressed

	// re-allocate a new image
	image.create(Size(200, 200), CV_8U);
	image = 255;
	namedWindow("image_allocated");
	imshow("image_allocated", image);	// show the image
	waitKey(0);	// wait for a key pressed

	//create a red color image
	// channel order is BGR
	Mat image_red(Size(240, 320), CV_8UC3, Scalar(0, 0, 255));
	// or
	// Mat image_red(Size(240, 320), CV_8UC3);
	// image_red = Scalar(0, 0, 255);
	namedWindow("image_red");
	imshow("image_red", image_red);
	waitKey(0);

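One of Mat's constructors has the form Mat(int rows, int cols, int type, const Scalar& s).

So Mat image(50, 320, CV_8U, 50) creates a matrix with 50 rows and 320 columns.

CV_8U specifies the data type of the image's elements.

In 8U, the 8 means 8 bits (1 byte) and the U means unsigned, so each value lies in the range 0 to 255.

Other depths exist as well, for example CV_8S, CV_16U, CV_16S, CV_32S, CV_32F, and CV_64F. (The full list, including the multi-channel variants, is much longer.)

In short, the number after CV_ is the number of bits per channel value, and the letter that follows is the kind of data (U for unsigned integer, S for signed integer, F for floating point).

Further down in the code, CV_8UC3 also appears. The added C3 is the number of channels, so CV_8UC3 means each pixel is stored as three unsigned 8-bit values (B, G, R in OpenCV's default order). (CV_8U, with a single channel, is the natural type for grayscale images.)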
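The naming scheme above can be made concrete with a small stand-alone sketch. It does not use OpenCV; it re-implements, under my own hypothetical names (Depth, makeType, typeName and so on), the way the CV_MAKETYPE macro packs a depth and a channel count into one integer. It covers only the seven classic depths from CV_8U to CV_64F.

```cpp
#include <string>

// Depth codes, mirroring the values OpenCV assigns to CV_8U ... CV_64F.
enum Depth { D_8U = 0, D_8S = 1, D_16U = 2, D_16S = 3, D_32S = 4, D_32F = 5, D_64F = 6 };

// Pack depth and channel count the way CV_MAKETYPE does:
// the low 3 bits hold the depth, the higher bits hold (channels - 1).
constexpr int makeType(int depth, int channels) {
    return (depth & 7) + ((channels - 1) << 3);
}

constexpr int depthOf(int type) { return type & 7; }
constexpr int channelsOf(int type) { return (type >> 3) + 1; }

// Bytes used by ONE channel value of the given depth (depths 0..6 only).
constexpr int bytesPerChannel(int depth) {
    return depth <= D_8S ? 1 : depth <= D_16S ? 2 : depth <= D_32F ? 4 : 8;
}

// Bytes used by one whole pixel, that is, all of its channels together.
constexpr int bytesPerPixel(int type) {
    return bytesPerChannel(depthOf(type)) * channelsOf(type);
}

// Rebuild a readable name such as "CV_8UC3" from a packed type code.
std::string typeName(int type) {
    static const char* names[] = {"8U", "8S", "16U", "16S", "32S", "32F", "64F"};
    const int d = depthOf(type);
    if (d > D_64F) return "unknown";  // depth codes beyond this sketch
    return std::string("CV_") + names[d] + "C" + std::to_string(channelsOf(type));
}
```

For example, makeType(D_8U, 3) gives 16, which is the value of CV_8UC3, and bytesPerPixel(16) gives 3: one byte for each of the three channels.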

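Returning to the code, Mat image(50, 320, CV_8U, 50) is an image of 50 rows by 320 columns in which each pixel is a single 8-bit unsigned channel, and the final 50 is the Scalar value used to initialize every pixel (a dark gray, 50 on the 0 to 255 scale).

Displayed on screen, it appears as a uniform dark gray strip, 320 pixels wide and 50 pixels tall.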

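image.create(Size(200, 200), CV_8U) uses create to set the size and type of a Mat, whether or not it has already been allocated. If the requested size and type differ from the current ones, the old data is released and a new buffer is allocated; the new pixels are left uninitialized.

Here the dimensions are given as Size(200, 200) instead of separate rows and cols arguments, which I personally find more readable. Note that Size takes (width, height), that is (cols, rows), the reverse of the (rows, cols) order.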
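Because the two argument orders are easy to mix up, here is a minimal stand-alone sketch of the relationship. SizeWH and Shape are hypothetical stand-ins, not OpenCV types, and offsetOf assumes a continuous row-major buffer with no padding between rows.

```cpp
#include <cstddef>

// Hypothetical stand-in for cv::Size: width first, then height.
struct SizeWH { int width; int height; };

// The (rows, cols) view used by the Mat(rows, cols, ...) constructor.
struct Shape { int rows; int cols; };

// Size is (width, height), so it maps to (cols, rows), not (rows, cols).
inline Shape shapeFromSize(SizeWH s) {
    return Shape{s.height, s.width};
}

// Byte offset of pixel (row, col) in a continuous row-major buffer.
inline std::size_t offsetOf(Shape sh, int row, int col, int bytesPerPixel) {
    return (static_cast<std::size_t>(row) * sh.cols + col) * bytesPerPixel;
}
```

With this model, a Size of (240, 320) becomes 320 rows by 240 columns, and the first pixel of row 1 in a 3-byte-per-pixel image starts at byte 720.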

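After image = 255, every pixel of the resized image has the value 255 (the brightest), so the result is a uniformly white 200 x 200 image.

Mat image_red(Size(240, 320), CV_8UC3, Scalar(0, 0, 255)) can be read the same way.

The data type is CV_8UC3, so this time there are three channels (stored in B, G, R order).

The value Scalar(0, 0, 255) sets the red channel to 255, the maximum for 8 bits (the components are B, G, R in that order).

The result is a solid red image, 240 pixels wide and 320 pixels tall.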

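In addition,

Mat image_red(Size(240, 320), CV_8UC3, Scalar(0, 0, 255));

can also be written as

Mat image_red(Size(240, 320), CV_8UC3);

image_red = Scalar(0, 0, 255);
(Mat provides many constructor overloads, so you can create the matrix with only its size and type first and assign the value afterwards.)

※ If you assign 255 instead of Scalar(0, 0, 255), only the B channel is set to 255 (the value is treated as Scalar(255, 0, 0)), which gives a completely different, blue image. Always keep the type in mind.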
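The pitfall in the note above follows from how a scalar with missing components is padded with zeros. The sketch below models that behavior without OpenCV; Scalar4 and fill are hypothetical names of mine, and the buffer is a plain interleaved 8-bit array with at most four channels.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for cv::Scalar: four values, missing ones default to 0.
struct Scalar4 {
    std::array<double, 4> v;
    Scalar4(double v0 = 0, double v1 = 0, double v2 = 0, double v3 = 0)
        : v{{v0, v1, v2, v3}} {}
};

// Fill an interleaved 8-bit buffer: channel c of every pixel gets s.v[c].
// Assumes 1 <= channels <= 4 and scalar values already within 0..255.
std::vector<std::uint8_t> fill(int pixels, int channels, const Scalar4& s) {
    std::vector<std::uint8_t> data(static_cast<std::size_t>(pixels) * channels);
    for (int p = 0; p < pixels; ++p)
        for (int c = 0; c < channels; ++c)
            data[p * channels + c] = static_cast<std::uint8_t>(s.v[c]);
    return data;
}
```

Calling fill(2, 3, Scalar4(0, 0, 255)) writes 0, 0, 255 into every pixel (red in BGR order), whereas fill(2, 3, 255) converts the bare 255 to Scalar4(255, 0, 0, 0) and writes 255, 0, 0 (blue).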

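2. Copying a Mat and what is actually copied (values or address)

OpenCV has a distinction similar to call by reference versus call by value: a copy can either share the original's pixel data (shallow copy) or own a separate duplicate of it (deep copy).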

	// read an image
	Mat lena = imread("lena.tif");
	namedWindow("lena");
	imshow("lena", lena);
	waitKey(0);

	// all these images point to the same data block
	Mat copiedLena(lena);
	image = lena;

	// these images are new copies of the source image
	Mat newCopiedLena;
	lena.copyTo(newCopiedLena);
	Mat newCopiedLena2;
	newCopiedLena2 = lena.clone();

	// transform the image for testing
	flip(lena, lena, -1);
	
	// check which images have been affected by the processing
	imshow("lena", lena);
	imshow("copiedLena", copiedLena);
	imshow("image", image);
	imshow("newCopiedLena", newCopiedLena);
	imshow("newCopiedLena2", newCopiedLena2);
	waitKey(0);

	// get a gray-level image from a function
	Mat gray = getGrayImage();
	imshow("grayImage", gray);	
	waitKey(0);

The code loads an image and copies it in several ways. It then modifies the original to see whether the copies change as well.

Mat lena = imread("lena.tif") reads the original image.

Method 1) all these images point to the same data block

Mat copiedLena(lena) copies through the copy constructor.
image = lena is a direct assignment; image is the Mat that was set up earlier with image.create(Size(200, 200), CV_8U).

Method 2) these images are new copies of the source image

Mat newCopiedLena;
lena.copyTo(newCopiedLena);
Mat newCopiedLena2;
newCopiedLena2 = lena.clone();

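After copying with the two methods above, the original image is flipped.

flip(lena, lena, -1)

// flip is declared as: void flip(InputArray src, OutputArray dst, int flipCode);

// flipCode selects the axis: 0 flips around the x-axis (vertically), a positive value flips around the y-axis (horizontally), and a negative value flips around both.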
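To make the three flipCode cases concrete, here is a stand-alone sketch that applies the same rule to a small grid of integers. Grid and flipGrid are hypothetical names; this is an illustration of the semantics, not how OpenCV implements flip.

```cpp
#include <vector>

using Grid = std::vector<std::vector<int>>;

// flipCode == 0: flip around the x-axis (top and bottom rows swap)
// flipCode  > 0: flip around the y-axis (left and right columns swap)
// flipCode  < 0: flip around both axes
Grid flipGrid(const Grid& src, int flipCode) {
    const int rows = static_cast<int>(src.size());
    const int cols = rows ? static_cast<int>(src[0].size()) : 0;
    Grid dst(rows, std::vector<int>(cols));
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            // Pick the source row/column that lands on (r, c) after the flip.
            const int sr = (flipCode <= 0) ? rows - 1 - r : r;
            const int sc = (flipCode != 0) ? cols - 1 - c : c;
            dst[r][c] = src[sr][sc];
        }
    }
    return dst;
}
```

For the grid {{1, 2}, {3, 4}}, flipCode 0 yields {{3, 4}, {1, 2}}, flipCode 1 yields {{2, 1}, {4, 3}}, and flipCode -1, the value used in this post, yields {{4, 3}, {2, 1}}.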

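The results are as follows.

With method 1, the copies now show the same flipped image as the original (each object refers to the same block of memory).

With method 2, the copies still show the image from before the flip (the pixel values were duplicated, so changes to the original do not affect them).

Images will be modified often from here on, so use copyTo or clone whenever the original must be preserved.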
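The difference between the two methods can be modeled in a few lines of plain C++. TinyMat below is a hypothetical, much simplified stand-in: it uses std::shared_ptr for the shared pixel buffer, whereas cv::Mat keeps its own reference counter, so treat it only as a sketch of the idea.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical, simplified model: a small header plus a shared pixel buffer.
struct TinyMat {
    int rows = 0, cols = 0;
    std::shared_ptr<std::vector<std::uint8_t>> data;  // reference-counted pixels

    TinyMat() = default;
    TinyMat(int r, int c, std::uint8_t value)
        : rows(r), cols(c),
          data(std::make_shared<std::vector<std::uint8_t>>(
              static_cast<std::size_t>(r) * c, value)) {}

    // Deep copy: allocate a new buffer and duplicate the pixels into it.
    TinyMat clone() const {
        TinyMat m;
        m.rows = rows;
        m.cols = cols;
        if (data) m.data = std::make_shared<std::vector<std::uint8_t>>(*data);
        return m;
    }
    // The compiler-generated copy constructor and operator= copy only the
    // header fields and the pointer, so such copies share the same pixels.
};
```

Given TinyMat a(2, 2, 10), the statement TinyMat b = a shares a's buffer, so writing through a is visible through b, while TinyMat c = a.clone() keeps its own pixels and is unaffected.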

3. Converting to a different type

Sometimes an image needs to be converted to a different type. This can be done as follows.

// read the image in gray scale
Mat gray_image = imread("lena.tif", IMREAD_GRAYSCALE);
Mat gray_image_float;
gray_image.convertTo(gray_image_float, CV_32F, 1 / 255.0, 0.0); // CV_32F -> data range : 0.0 ~ 1.0

The image was read as grayscale, so its type is CV_8U.

(※ For reference: IMREAD_GRAYSCALE = 0, //!< If set, always convert image to the single channel grayscale image)

convertTo is declared as void convertTo(OutputArray m, int rtype, double alpha=1, double beta=0) const.

alpha is a factor each value is multiplied by and beta is an offset added afterwards. Here the CV_8U data (0 to 255) is divided by 255, giving values between 0.0 and 1.0. Only the range of the data changes; the relative values stay the same, so the displayed image looks identical.
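The per-element arithmetic described above, dst = src * alpha + beta, can be sketched without OpenCV as follows. toFloat is a hypothetical helper that models only the CV_8U to CV_32F case on a flat array.

```cpp
#include <cstdint>
#include <vector>

// Model of the 8-bit to float conversion: dst[i] = src[i] * alpha + beta.
std::vector<float> toFloat(const std::vector<std::uint8_t>& src,
                           double alpha = 1.0, double beta = 0.0) {
    std::vector<float> dst;
    dst.reserve(src.size());
    for (std::uint8_t v : src)
        dst.push_back(static_cast<float>(v * alpha + beta));
    return dst;
}
```

With alpha = 1 / 255.0 and beta = 0, the inputs 0, 51, and 255 map to roughly 0.0, 0.2, and 1.0, which is the same rescaling the convertTo call in this post performs.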

Why is a conversion like this needed at all? I searched and found a clear answer on Stack Overflow.

CV_8U is unsigned 8bit/pixel - ie a pixel can have values 0-255, this is the normal range for most image and video formats.

CV_32F is float - the pixel can have any value between 0-1.0, this is useful for some sets of calculations on data - but it has to be converted into 8bits to save or display by multiplying each pixel by 255.

CV_32S is a signed 32bit integer value for each pixel - again useful of you are doing integer maths on the pixels, but again needs converting into 8bits to save or display. This is trickier since you need to decide how to convert the much larger range of possible values (+/- 2billion!) into 0-255

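CV_8U is the usual choice, but 32F or 32S types are apparently needed for some mathematical computations. (CV_32F itself can hold any float value; the 0 to 1 range matters because imshow multiplies floating-point pixels by 255 before display.)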
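The quoted answer mentions converting float data back to 8 bits before saving or displaying it. A minimal sketch of that step for a single pixel is shown below. toByte is a hypothetical helper; it rounds with std::round, so its handling of exact ties may differ from OpenCV's own saturating conversion.

```cpp
#include <cmath>
#include <cstdint>

// Map a float pixel in the nominal range [0, 1] to an 8-bit value,
// rounding to nearest and clamping anything that falls outside 0..255.
std::uint8_t toByte(float v) {
    const float scaled = std::round(v * 255.0f);
    if (!(scaled > 0.0f)) return 0;      // negative values, zero, and NaN
    if (scaled > 255.0f) return 255;     // values above the 8-bit maximum
    return static_cast<std::uint8_t>(scaled);
}
```

Here toByte(0.0f) gives 0 and toByte(1.0f) gives 255, while out-of-range inputs such as -1.0f or 2.0f are clamped to 0 and 255 instead of wrapping around.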

That concludes this brief look at the Mat class. Next time I will move on to accessing and manipulating the pixels of a loaded image.
